NOVEL NUCLEIC ACIDS AND POLYPEPTIDES

1. TECHNICAL FIELD 

The present invention provides novel polynucleotides and proteins encoded by
such polynucleotides, along with uses for these polynucleotides and proteins, for example
in therapeutic, diagnostic and research methods.

2. BACKGROUND 

Technology aimed at the discovery of protein factors (including e.g., cytokines,
such as lymphokines, interferons, CSFs, chemokines, and interleukins) has matured
rapidly over the past decade. The now routine hybridization cloning and expression
cloning techniques clone novel polynucleotides "directly" in the sense that they rely on
information directly related to the discovered protein (i.e., partial DNA/amino acid
sequence of the protein in the case of hybridization cloning; activity of the protein in the
case of expression cloning). More recent "indirect" cloning techniques such as signal
sequence cloning, which isolates DNA sequences based on the presence of a now
well-recognized secretory leader sequence motif, as well as various PCR-based or low
stringency hybridization-based cloning techniques, have advanced the state of the art by
making available large numbers of DNA/amino acid sequences for proteins that are
known to have biological activity, for example, by virtue of their secreted nature in the
case of leader sequence cloning, by virtue of their cell or tissue source in the case of
PCR-based techniques, or by virtue of structural similarity to other genes of known
biological activity.

Identified polynucleotide and polypeptide sequences have numerous applications
in, for example, diagnostics, forensics, gene mapping, identification of mutations
responsible for genetic disorders or other traits, assessment of biodiversity, and production
of many other types of data and products dependent on DNA and amino acid sequences.

3. SUMMARY OF THE INVENTION 

The compositions of the present invention include novel isolated polypeptides, novel
isolated polynucleotides encoding such polypeptides, including recombinant DNA
molecules, cloned genes or degenerate variants thereof, especially naturally occurring
variants such as allelic variants, antisense polynucleotide molecules, and antibodies that
specifically recognize one or more epitopes present on such polypeptides, as well as
hybridomas producing such antibodies.

The compositions of the present invention additionally include vectors, including
expression vectors, containing the polynucleotides of the invention, cells genetically
engineered to contain such polynucleotides and cells genetically engineered to express such
polynucleotides.

The present invention relates to a collection or library of at least one novel nucleic
acid sequence assembled from expressed sequence tags (ESTs) isolated mainly by
sequencing by hybridization (SBH), and in some cases, sequences obtained from one or
more public databases. The invention relates also to the proteins encoded by such
polynucleotides, along with therapeutic, diagnostic and research utilities for these
polynucleotides and proteins. These nucleic acid sequences are designated as SEQ ID NO:
1-438 and are provided in the Sequence Listing. In the nucleic acids provided in the
Sequence Listing, A is adenine; C is cytosine; G is guanine; T is thymine; and N is any of
the four bases. In the amino acids provided in the Sequence Listing, * corresponds to the
stop codon.

The nucleic acid sequences of the present invention also include nucleic acid
sequences that hybridize to the complement of SEQ ID NO: 1-438 under stringent
hybridization conditions; nucleic acid sequences which are allelic variants or species
homologues of any of the nucleic acid sequences recited above; or nucleic acid sequences
that encode a peptide comprising a specific domain or truncation of the peptides encoded by
SEQ ID NO: 1-438. Also included is a polynucleotide comprising a nucleotide sequence
having at least 90% identity to an identifying sequence of SEQ ID NO: 1-438 or a
degenerate variant or fragment thereof. The identifying sequence can be 100 base pairs in
length.
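
As an illustration of the 90% identity threshold, the following minimal sketch computes percent identity between two sequences of equal length by position-wise comparison. The specification does not state how identity is computed, so the ungapped comparison and the function name percent_identity are assumptions made for illustration only; sequences that differ by insertions or deletions would need a proper alignment first.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Ungapped, position-wise percent identity (an illustrative assumption)."""
    if not seq_a or len(seq_a) != len(seq_b):
        raise ValueError("sequences must be non-empty and of equal length")
    # Count positions where both sequences carry the same base.
    matches = sum(x == y for x, y in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)


# Hypothetical 100-base identifying sequence and a copy with 10 substitutions.
identifying = "ACGT" * 25
variant = list(identifying)
for i in range(0, 100, 10):
    variant[i] = "T"  # positions 0, 10, ..., 90 hold A or G, so each is a mismatch
variant = "".join(variant)

identity = percent_identity(identifying, variant)
meets_threshold = identity >= 90.0
```

Under these assumptions, a 100-base identifying sequence tolerates at most ten substituted positions before falling below the 90% threshold.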

The nucleic acid sequences of the present invention also include the sequence
information from the nucleic acid sequences of SEQ ID NO: 1-438. The sequence
information can be a segment of any one of SEQ ID NO: 1-438 that uniquely identifies or
represents the sequence information of SEQ ID NO: 1-438.

A collection as used in this application can be a collection of only one
polynucleotide. The collection of sequence information or identifying information of each
sequence can be provided on a nucleic acid array. In one embodiment, segments of
sequence information are provided on a nucleic acid array to detect the polynucleotide that
contains the segment. The array can be designed to detect full-match or mismatch to the
polynucleotide that contains the segment. The collection can also be provided in a
computer-readable format.

This invention also includes the reverse or direct complement of any of the nucleic
acid sequences recited above; cloning or expression vectors containing the nucleic acid
sequences; and host cells or organisms transformed with these expression vectors. Nucleic
acid sequences (or their reverse or direct complements) according to the invention have
numerous applications in a variety of techniques known to those skilled in the art of
molecular biology, such as use as hybridization probes, use as primers for PCR, use in an
array, use in computer-readable media, use in sequencing full-length genes, use for
chromosome and gene mapping, use in the recombinant production of protein, and use in
the generation of anti-sense DNA or RNA, their chemical analogs and the like.

In a preferred embodiment, the nucleic acid sequences of SEQ ID NO: 1-438 or
novel segments or parts of the nucleic acids of the invention are used as primers in
expression assays that are well known in the art. In a particularly preferred embodiment, the
nucleic acid sequences of SEQ ID NO: 1-438 or novel segments or parts of the nucleic acids
provided herein are used in diagnostics for identifying expressed genes or, as well known in
the art and exemplified by Vollrath et al., Science 258:52-59 (1992), as expressed sequence
tags for physical mapping of the human genome.

The isolated polynucleotides of the invention include, but are not limited to, a
polynucleotide comprising any one of the nucleotide sequences set forth in SEQ ID NO:
1-438; a polynucleotide comprising any of the full length protein coding sequences of SEQ ID
NO: 1-438; and a polynucleotide comprising any of the nucleotide sequences of the mature
protein coding sequences of SEQ ID NO: 1-438. The polynucleotides of the present
invention also include, but are not limited to, a polynucleotide that hybridizes under
stringent hybridization conditions to (a) the complement of any one of the nucleotide
sequences set forth in SEQ ID NO: 1-438; (b) a nucleotide sequence encoding any one of
the amino acid sequences set forth in the Sequence Listing; (c) a polynucleotide which is an
allelic variant of any polynucleotides recited above; (d) a polynucleotide which encodes a
species homolog (e.g., orthologs) of any of the proteins recited above; or (e) a
polynucleotide that encodes a polypeptide comprising a specific domain or truncation of any
of the polypeptides comprising an amino acid sequence set forth in the Sequence Listing.

The isolated polypeptides of the invention include, but are not limited to, a
polypeptide comprising any of the amino acid sequences set forth in the Sequence Listing;
or the corresponding full length or mature protein. Polypeptides of the invention also include
polypeptides with biological activity that are encoded by (a) any of the polynucleotides
having a nucleotide sequence set forth in SEQ ID NO: 1-438; or (b) polynucleotides that
hybridize to the complement of the polynucleotides of (a) under stringent hybridization
conditions. Biologically or immunologically active variants of any of the polypeptide
sequences in the Sequence Listing, and "substantial equivalents" thereof (e.g., with at least
about 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98% or 99% amino acid sequence identity)
that preferably retain biological activity are also contemplated. The polypeptides of the
invention may be wholly or partially chemically synthesized but are preferably produced by
recombinant means using the genetically engineered cells (e.g., host cells) of the invention.

The invention also provides compositions comprising a polypeptide of the
invention. Polypeptide compositions of the invention may further comprise an acceptable
carrier, such as a hydrophilic, e.g., pharmaceutically acceptable, carrier.

The invention also provides host cells transformed or transfected with a 
polynucleotide of the invention. 

The invention also relates to methods for producing a polypeptide of the invention
comprising growing a culture of the host cells of the invention in a suitable culture
medium under conditions permitting expression of the desired polypeptide, and purifying
the polypeptide from the culture or from the host cells. Preferred embodiments include
those in which the protein produced by such process is a mature form of the protein.

Polynucleotides according to the invention have numerous applications in a
variety of techniques known to those skilled in the art of molecular biology. These
techniques include use as hybridization probes, use as oligomers, or primers, for PCR,
use for chromosome and gene mapping, use in the recombinant production of protein,
and use in generation of anti-sense DNA or RNA, their chemical analogs and the like.
For example, when the expression of an mRNA is largely restricted to a particular cell or
tissue type, polynucleotides of the invention can be used as hybridization probes to detect
the presence of the particular cell or tissue mRNA in a sample using, e.g., in situ
hybridization.

In other exemplary embodiments, the polynucleotides are used in diagnostics as 
expressed sequence tags for identifying expressed genes or, as well known in the art and 
exemplified by Vollrath et al., Science 258:52-59 (1992), as expressed sequence tags for 
physical mapping of the human genome. 

The polypeptides according to the invention can be used in a variety of
conventional procedures and methods that are currently applied to other proteins. For
example, a polypeptide of the invention can be used to generate an antibody that
specifically binds the polypeptide. Such antibodies, particularly monoclonal antibodies,
are useful for detecting or quantitating the polypeptide in tissue. The polypeptides of the
invention can also be used as molecular weight markers, and as a food supplement.

Methods are also provided for preventing, treating, or ameliorating a medical 
condition which comprises the step of administering to a mammalian subject a 
therapeutically effective amount of a composition comprising a polypeptide of the 
present invention and a pharmaceutically acceptable carrier. 

In particular, the polypeptides and polynucleotides of the invention can be
utilized, for example, in methods for the prevention and/or treatment of disorders
involving aberrant protein expression or biological activity.

The present invention further relates to methods for detecting the presence of the
polynucleotides or polypeptides of the invention in a sample. Such methods can, for
example, be utilized as part of prognostic and diagnostic evaluation of disorders as
recited herein and for the identification of subjects exhibiting a predisposition to such
conditions. The invention provides a method for detecting the polynucleotides of the
invention in a sample, comprising contacting the sample with a compound that binds to
and forms a complex with the polynucleotide of interest for a period sufficient to form
the complex and under conditions sufficient to form a complex and detecting the complex
such that if a complex is detected, the polynucleotide of interest is detected. The
invention also provides a method for detecting the polypeptides of the invention in a
sample comprising contacting the sample with a compound that binds to and forms a
complex with the polypeptide under conditions and for a period sufficient to form the
complex and detecting the formation of the complex such that if a complex is formed, the
polypeptide is detected.

The invention also provides kits comprising polynucleotide probes and/or
monoclonal antibodies, and optionally quantitative standards, for carrying out methods of
the invention. Furthermore, the invention provides methods for evaluating the efficacy of
drugs, and monitoring the progress of patients, involved in clinical trials for the treatment
of disorders as recited above.

The invention also provides methods for the identification of compounds that
modulate (i.e., increase or decrease) the expression or activity of the polynucleotides
and/or polypeptides of the invention. Such methods can be utilized, for example, for the
identification of compounds that can ameliorate symptoms of disorders as recited herein.

Such methods can include, but are not limited to, assays for identifying compounds and
other substances that interact with (e.g., bind to) the polypeptides of the invention. The
invention provides a method for identifying a compound that binds to the polypeptides of
the invention comprising contacting the compound with a polypeptide of the invention in
a cell for a time sufficient to form a polypeptide/compound complex, wherein the
complex drives expression of a reporter gene sequence in the cell; and detecting the
complex by detecting the reporter gene sequence expression such that if expression of the
reporter gene is detected the compound that binds to a polypeptide of the invention is
identified.

The invention also provides methods for treatment which involve
the administration of the polynucleotides or polypeptides of the invention to individuals
exhibiting symptoms or tendencies. In addition, the invention encompasses methods for
treating diseases or disorders as recited herein comprising administering compounds and
other substances that modulate the overall activity of the target gene products.
Compounds and other substances can effect such modulation either on the level of target
gene/protein expression or target protein activity.

The polypeptides of the present invention and the polynucleotides encoding them
are also useful for the same functions known to one of skill in the art as the polypeptides
and polynucleotides to which they have homology (set forth in Table 2); for which they
have a signature region (as set forth in Table 3); or for which they have homology to a
gene family (as set forth in Table 4). If no homology is set forth for a sequence, then the
polypeptides and polynucleotides of the present invention are useful for a variety of
applications, as described herein, including use in arrays for detection.

4. DETAILED DESCRIPTION OF THE INVENTION 

4.1 DEFINITIONS 

It must be noted that as used herein and in the appended claims, the singular 
forms "a", "an" and "the" include plural references unless the context clearly dictates 
otherwise. 

The term "active" refers to those forms of the polypeptide which retain the
biologic and/or immunologic activities of any naturally occurring polypeptide. According
to the invention, the terms "biologically active" or "biological activity" refer to a protein
or peptide having structural, regulatory or biochemical functions of a naturally occurring
molecule. Likewise "immunologically active" or "immunological activity" refers to the
capability of the natural, recombinant or synthetic polypeptide to induce a specific
immune response in appropriate animals or cells and to bind with specific antibodies.

The term "activated cells" as used in this application refers to those cells which are
engaged in extracellular or intracellular membrane trafficking, including the export of
secretory or enzymatic molecules as part of a normal or disease process.

The terms "complementary" or "complementarity" refer to the natural binding of
polynucleotides by base pairing. For example, the sequence 5'-AGT-3' binds to the
complementary sequence 3'-TCA-5'. Complementarity between two single-stranded
molecules may be "partial" such that only some of the nucleic acids bind or it may be
"complete" such that total complementarity exists between the single stranded molecules.
The degree of complementarity between the nucleic acid strands has significant effects on
the efficiency and strength of the hybridization between the nucleic acid strands.
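
The base-pairing rule above can be illustrated with a short sketch. It is a simplified model that assumes DNA strands of equal length, Watson-Crick pairing only, and no gaps or bulges; the function names are hypothetical and are not part of the invention.

```python
# Watson-Crick partner of each DNA base.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq_5to3: str) -> str:
    """Return the fully complementary strand, also written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq_5to3.upper()))


def complementarity(strand_5to3: str, other_5to3: str) -> float:
    """Fraction of positions that pair when the strands are aligned antiparallel."""
    if not strand_5to3 or len(strand_5to3) != len(other_5to3):
        raise ValueError("strands must be non-empty and of equal length")
    partner_3to5 = other_5to3.upper()[::-1]  # read the second strand 3'->5'
    paired = sum(COMPLEMENT[x] == y
                 for x, y in zip(strand_5to3.upper(), partner_3to5))
    return paired / len(strand_5to3)


# 5'-AGT-3' pairs with 3'-TCA-5', which reads ACT when written 5'->3'.
complete = complementarity("AGT", "ACT")
# One mispaired position gives "partial" complementarity.
partial = complementarity("AGT", "ACC")
```

In this model a value of 1.0 corresponds to "complete" complementarity and any value between 0 and 1 to "partial" complementarity.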

The term "embryonic stem cells (ES)" refers to a cell that can give rise to many
differentiated cell types in an embryo or an adult, including the germ cells. The term
"germ line stem cells (GSCs)" refers to stem cells derived from primordial stem cells that
provide a steady and continuous source of germ cells for the production of gametes. The
term "primordial germ cells (PGCs)" refers to a small population of cells set aside from
other cell lineages particularly from the yolk sac, mesenteries, or gonadal ridges during
embryogenesis that have the potential to differentiate into germ cells and other cells.
PGCs are the source from which GSCs and ES cells are derived. The PGCs, the GSCs
and the ES cells are capable of self-renewal. Thus these cells not only populate the germ
line and give rise to a plurality of terminally differentiated cells that comprise the adult
specialized organs, but are able to regenerate themselves.

The term "expression modulating fragment," EMF, means a series of nucleotides 
which modulates the expression of an operably linked ORF or another EMF. 

As used herein, a sequence is said to "modulate the expression of an operably
linked sequence" when the expression of the sequence is altered by the presence of the
EMF. EMFs include, but are not limited to, promoters, and promoter modulating
sequences (inducible elements). One class of EMFs are nucleic acid fragments which
induce the expression of an operably linked ORF in response to a specific regulatory
factor or physiological event.

The terms "nucleotide sequence" or "nucleic acid" or "polynucleotide" or
"oligonucleotide" are used interchangeably and refer to a heteropolymer of nucleotides or
the sequence of these nucleotides. These phrases also refer to DNA or RNA of genomic
or synthetic origin which may be single-stranded or double-stranded and may represent
the sense or the antisense strand, to peptide nucleic acid (PNA) or to any DNA-like or
RNA-like material. In the sequences herein A is adenine, C is cytosine, T is thymine, G
is guanine and N is A, C, G or T (U). It is contemplated that where the polynucleotide is
RNA, the T (thymine) in the sequences provided herein is substituted with U (uracil).
Generally, nucleic acid segments provided by this invention may be assembled from
fragments of the genome and short oligonucleotide linkers, or from a series of
oligonucleotides, or from individual nucleotides, to provide a synthetic nucleic acid
which is capable of being expressed in a recombinant transcriptional unit comprising
regulatory elements derived from a microbial or viral operon, or a eukaryotic gene.

The terms "oligonucleotide fragment" or a "polynucleotide fragment", "portion,"
or "segment" or "probe" or "primer" are used interchangeably and refer to a sequence of
nucleotide residues which are at least about 5 nucleotides, more preferably at least about
7 nucleotides, more preferably at least about 9 nucleotides, more preferably at least about
11 nucleotides and most preferably at least about 17 nucleotides. The fragment is
preferably less than about 500 nucleotides, preferably less than about 200 nucleotides,
more preferably less than about 100 nucleotides, more preferably less than about 50
nucleotides and most preferably less than 30 nucleotides. Preferably the probe is from
about 6 nucleotides to about 200 nucleotides, preferably from about 15 to about 50
nucleotides, more preferably from about 17 to 30 nucleotides and most preferably from
about 20 to 25 nucleotides. Preferably the fragments can be used in polymerase chain
reaction (PCR), various hybridization procedures or microarray procedures to identify or
amplify identical or related parts of mRNA or DNA molecules. A fragment or segment
may uniquely identify each polynucleotide sequence of the present invention. Preferably
the fragment comprises a sequence substantially similar to any one of SEQ ID NOs: 1-438.

Probes may, for example, be used to determine whether specific mRNA
molecules are present in a cell or tissue or to isolate similar nucleic acid sequences from
chromosomal DNA as described by Walsh et al. (Walsh, P.S. et al., 1992, PCR Methods
Appl 1:241-250). They may be labeled by nick translation, Klenow fill-in reaction, PCR,
or other methods well known in the art. Probes of the present invention, their preparation
and/or labeling are elaborated in Sambrook, J. et al., 1989, Molecular Cloning: A
Laboratory Manual, Cold Spring Harbor Laboratory, NY; or Ausubel, F.M. et al., 1989,
Current Protocols in Molecular Biology, John Wiley & Sons, New York NY, both of
which are incorporated herein by reference in their entirety.

The nucleic acid sequences of the present invention also include the sequence
information from the nucleic acid sequences of SEQ ID NOs: 1-438. The sequence
information can be a segment of any one of SEQ ID NOs: 1-438 that uniquely identifies
or represents the sequence information of that sequence of SEQ ID NO: 1-438. One such
segment can be a twenty-mer nucleic acid sequence because the probability that a
twenty-mer is fully matched in the human genome is approximately 1 in 300. In the human
genome, there are three billion base pairs in one set of chromosomes. Because 4^20 possible
twenty-mers exist, there are approximately 300 times more twenty-mers than there are base
pairs in a set of human chromosomes. Using the same analysis, the probability for a
seventeen-mer to be fully matched in the human genome is approximately 1 in 5. When
these segments are used in arrays for expression studies, fifteen-mer segments can be used.
The probability that the fifteen-mer is fully matched in the expressed sequences is also
approximately one in five because expressed sequences comprise less than approximately
5% of the entire genome sequence.
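
The arithmetic behind these full-match figures can be reproduced with a minimal sketch. It assumes, as the passage does, a genome of three billion base pairs, an expressed fraction of 5% (the upper bound given above), and equal likelihood of each of the four bases at every position; the quantity computed is the number of target positions divided by the number of possible k-mers, and the names used are illustrative only.

```python
GENOME_BP = 3_000_000_000   # base pairs in one set of human chromosomes (from the text)
EXPRESSED_FRACTION = 0.05   # upper bound on expressed sequence given in the text


def full_match_probability(k: int, target_bp: float) -> float:
    """Chance that a given k-mer occurs in a target of target_bp base pairs.

    Approximated as target_bp / 4**k, which assumes uniformly random bases.
    """
    return target_bp / 4 ** k


p20_genome = full_match_probability(20, GENOME_BP)
p17_genome = full_match_probability(17, GENOME_BP)
p15_expressed = full_match_probability(15, GENOME_BP * EXPRESSED_FRACTION)

for label, p in [("20-mer, genome", p20_genome),
                 ("17-mer, genome", p17_genome),
                 ("15-mer, expressed", p15_expressed)]:
    print(f"{label}: about 1 in {1 / p:.1f}")
```

Under these assumptions the computed values fall in the same range as the rounded figures quoted above (a few hundred to one for a twenty-mer, and single digits to one for the seventeen-mer and fifteen-mer cases).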

Similarly, when using sequence information for detecting a single mismatch, a
segment can be a twenty-five mer. The probability that the twenty-five mer would appear in
a human genome with a single mismatch is calculated by multiplying the probability for a
full match (1/4^25) by the increased probability for mismatch at each nucleotide position
(3 x 25). The probability that an eighteen mer with a single mismatch can be detected in an
array for expression studies is approximately one in five. The probability that a twenty-mer
with a single mismatch can be detected in a human genome is approximately one in five.
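
The single-mismatch calculation can be sketched in the same way. The sketch assumes the same genome size and 5% expressed fraction used for the full-match analysis, uniformly random bases, and that the per-sequence factor (3 x k) / 4^k is scaled by the number of target positions; the function name is hypothetical.

```python
GENOME_BP = 3_000_000_000   # base pairs in one set of human chromosomes (from the text)
EXPRESSED_FRACTION = 0.05   # upper bound on expressed sequence given in the text


def single_mismatch_probability(k: int, target_bp: float) -> float:
    """Chance that a k-mer occurs in the target with exactly one mismatch.

    Each of the k positions can be replaced by any of 3 other bases, giving
    3 * k one-mismatch variants, each matching with probability 1 / 4**k.
    """
    return target_bp * (3 * k) / 4 ** k


p20_genome = single_mismatch_probability(20, GENOME_BP)
p18_expressed = single_mismatch_probability(18, GENOME_BP * EXPRESSED_FRACTION)
p25_genome = single_mismatch_probability(25, GENOME_BP)

for label, p in [("20-mer, genome", p20_genome),
                 ("18-mer, expressed", p18_expressed),
                 ("25-mer, genome", p25_genome)]:
    print(f"{label}: about 1 in {1 / p:.0f}")
```

With these assumptions the twenty-mer and eighteen-mer cases come out at single digits to one, in line with the rounded "one in five" figures, while a twenty-five mer is several orders of magnitude less likely to match by chance.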

The term "open reading frame," ORF, means a series of nucleotide triplets coding
for amino acids without any termination codons and is a sequence translatable into
protein.
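
The definition can be expressed as a simple check. This sketch assumes the standard genetic code, in which the termination codons are TAA, TAG and TGA, and treats any sequence containing an ambiguous base such as N as not verifiable; the function name is illustrative.

```python
# Termination codons of the standard genetic code (DNA alphabet).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def is_open_reading_frame(seq: str) -> bool:
    """True if seq is a whole number of triplets with no termination codon."""
    seq = seq.upper().replace("U", "T")  # accept RNA input as well
    if not seq or len(seq) % 3 != 0:
        return False
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    # Every codon must consist of unambiguous bases and must not be a stop.
    return all(set(codon) <= set("ACGT") and codon not in STOP_CODONS
               for codon in codons)


orf_ok = is_open_reading_frame("ATGGCCAAG")       # three sense codons
orf_with_stop = is_open_reading_frame("ATGTAAGCC")  # TAA in frame
orf_partial = is_open_reading_frame("ATGGC")      # not a whole number of triplets
```

The check is applied in a single reading frame; scanning all three frames of each strand would require calling it on each offset and on the reverse complement.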

The terms "operably linked" or "operably associated" refer to functionally related
nucleic acid sequences. For example, a promoter is operably associated or operably
linked with a coding sequence if the promoter controls the transcription of the coding
sequence. While operably linked nucleic acid sequences can be contiguous and in the
same reading frame, certain genetic elements, e.g., repressor genes, are not contiguously
linked to the coding sequence but still control transcription/translation of the coding
sequence.

The term "pluripotent" refers to the capability of a cell to differentiate into a
number of differentiated cell types that are present in an adult organism. A pluripotent
cell is restricted in its differentiation capability in comparison to a totipotent cell.

The terms "polypeptide" or "peptide" or "amino acid sequence" refer to an
oligopeptide, peptide, polypeptide or protein sequence or fragment thereof and to
naturally occurring or synthetic molecules. A polypeptide "fragment," "portion," or
"segment" is a stretch of amino acid residues of at least about 5 amino acids, preferably at
least about 7 amino acids, more preferably at least about 9 amino acids and most
preferably at least about 17 or more amino acids. The peptide preferably is not greater
than about 200 amino acids, more preferably less than 150 amino acids and most
preferably less than 100 amino acids. Preferably the peptide is from about 5 to about 200
amino acids. To be active, any polypeptide must have sufficient length to display
biological and/or immunological activity.

The term "naturally occurring polypeptide" refers to polypeptides produced by
cells that have not been genetically engineered and specifically contemplates various
polypeptides arising from post-translational modifications of the polypeptide including,
but not limited to, acetylation, carboxylation, glycosylation, phosphorylation, lipidation
and acylation.

The term "translated protein coding portion" means a sequence which encodes for 
the full length protein which may include any leader sequence or any processing 
sequence. 

The term "mature protein coding sequence" means a sequence which encodes a
peptide or protein without a signal or leader sequence. The "mature protein portion"
means that portion of the protein which does not include a signal or leader sequence. The
peptide may have been produced by processing in the cell which removes any
leader/signal sequence. The mature protein portion may or may not include the initial
methionine residue. The methionine residue may be removed from the protein during
processing in the cell. The peptide may be produced synthetically or the protein may
have been produced using a polynucleotide only encoding for the mature protein coding
sequence.

The term "derivative" refers to polypeptides chemically modified by such
techniques as ubiquitination, labeling (e.g., with radionuclides or various enzymes),
covalent polymer attachment such as pegylation (derivatization with polyethylene glycol)
and insertion or substitution by chemical synthesis of amino acids such as ornithine,
which do not normally occur in human proteins.

The term "variant" (or "analog") refers to any polypeptide differing from naturally
occurring polypeptides by amino acid insertions, deletions, and substitutions, created
using, e.g., recombinant DNA techniques. Guidance in determining which amino acid
residues may be replaced, added or deleted without abolishing activities of interest, may
be found by comparing the sequence of the particular polypeptide with that of
homologous peptides and minimizing the number of amino acid sequence changes made
in regions of high homology (conserved regions) or by replacing amino acids with
consensus sequence.

Alternatively, recombinant variants encoding these same or similar polypeptides
may be synthesized or selected by making use of the "redundancy" in the genetic code.
Various codon substitutions, such as the silent changes which produce various restriction
sites, may be introduced to optimize cloning into a plasmid or viral vector or expression
in a particular prokaryotic or eukaryotic system. Mutations in the polynucleotide
sequence may be reflected in the polypeptide or domains of other peptides added to the
polypeptide to modify the properties of any part of the polypeptide, to change
characteristics such as ligand-binding affinities, interchain affinities, or
degradation/turnover rate.

Preferably, amino acid "substitutions" are the result of replacing one amino acid
with another amino acid having similar structural and/or chemical properties, i.e.,
conservative amino acid replacements. "Conservative" amino acid substitutions may be
made on the basis of similarity in polarity, charge, solubility, hydrophobicity,
hydrophilicity, and/or the amphipathic nature of the residues involved. For example,
nonpolar (hydrophobic) amino acids include alanine, leucine, isoleucine, valine, proline,
phenylalanine, tryptophan, and methionine; polar neutral amino acids include glycine,
serine, threonine, cysteine, tyrosine, asparagine, and glutamine; positively charged (basic)
amino acids include arginine, lysine, and histidine; and negatively charged (acidic) amino
acids include aspartic acid and glutamic acid. "Insertions" or "deletions" are preferably
in the range of about 1 to 20 amino acids, more preferably 1 to 10 amino acids. The
variation allowed may be experimentally determined by systematically making insertions,
deletions, or substitutions of amino acids in a polypeptide molecule using recombinant
DNA techniques and assaying the resulting recombinant variants for activity.
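
The four residue classes listed above can be encoded directly. The following sketch uses standard one-letter amino acid codes and treats a substitution as conservative when both residues fall in the same class; that same-class rule is a simplifying assumption for illustration, since the passage also allows other measures of similarity, and the names used are hypothetical.

```python
# Residue classes exactly as listed in the passage, in one-letter codes.
RESIDUE_CLASSES = {
    "nonpolar": set("ALIVPFWM"),      # Ala, Leu, Ile, Val, Pro, Phe, Trp, Met
    "polar_neutral": set("GSTCYNQ"),  # Gly, Ser, Thr, Cys, Tyr, Asn, Gln
    "basic": set("RKH"),              # Arg, Lys, His
    "acidic": set("DE"),              # Asp, Glu
}


def residue_class(residue: str) -> str:
    """Return the class name for a one-letter amino acid code."""
    for name, members in RESIDUE_CLASSES.items():
        if residue.upper() in members:
            return name
    raise ValueError(f"unknown amino acid code: {residue!r}")


def is_conservative(original: str, replacement: str) -> bool:
    """True if both residues belong to the same class (illustrative rule)."""
    return residue_class(original) == residue_class(replacement)


leu_to_ile = is_conservative("L", "I")  # both nonpolar
asp_to_glu = is_conservative("D", "E")  # both acidic
lys_to_glu = is_conservative("K", "E")  # basic to acidic
```

A table of this kind covers only the twenty standard residues; non-standard residues such as ornithine would need to be classified separately.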

Alternatively, where alteration of function is desired, insertions, deletions or
non-conservative alterations can be engineered to produce altered polypeptides. Such
alterations can, for example, alter one or more of the biological functions or biochemical
characteristics of the polypeptides of the invention. For example, such alterations may
change polypeptide characteristics such as ligand-binding affinities, interchain affinities,
or degradation/turnover rate. Further, such alterations can be selected so as to generate
polypeptides that are better suited for expression, scale up and the like in the host cells
chosen for expression. For example, cysteine residues can be deleted or substituted with
another amino acid residue in order to eliminate disulfide bridges.

The terms "purified" or "substantially purified" as used herein denote that the
indicated nucleic acid or polypeptide is present in the substantial absence of other
biological macromolecules, e.g., polynucleotides, proteins, and the like. In one
embodiment, the polynucleotide or polypeptide is purified such that it constitutes at least
95% by weight, more preferably at least 99% by weight, of the indicated biological
macromolecules present (but water, buffers, and other small molecules, especially
molecules having a molecular weight of less than 1000 daltons, can be present).

The term "isolated" as used herein refers to a nucleic acid or polypeptide
separated from at least one other component (e.g., nucleic acid or polypeptide) present
with the nucleic acid or polypeptide in its natural source. In one embodiment, the nucleic
acid or polypeptide is found in the presence of (if anything) only a solvent, buffer, ion, or
other component normally present in a solution of the same. The terms "isolated" and
"purified" do not encompass nucleic acids or polypeptides present in their natural source.

The term "recombinant," when used herein to refer to a polypeptide or protein,
means that a polypeptide or protein is derived from recombinant (e.g., microbial, insect,
or mammalian) expression systems. "Microbial" refers to recombinant polypeptides or
proteins made in bacterial or fungal (e.g., yeast) expression systems. As a product,
"recombinant microbial" defines a polypeptide or protein essentially free of native
endogenous substances and unaccompanied by associated native glycosylation.
Polypeptides or proteins expressed in most bacterial cultures, e.g., E. coli, will be free of
glycosylation modifications; polypeptides or proteins expressed in yeast will have a
glycosylation pattern in general different from those expressed in mammalian cells.

The term "recombinant expression vehicle or vector" refers to a plasmid or phage
or virus or vector, for expressing a polypeptide from a DNA (RNA) sequence. An
expression vehicle can comprise a transcriptional unit comprising an assembly of (1) a
genetic element or elements having a regulatory role in gene expression, for example,
promoters or enhancers, (2) a structural or coding sequence which is transcribed into
mRNA and translated into protein, and (3) appropriate transcription initiation and
termination sequences. Structural units intended for use in yeast or eukaryotic expression
systems preferably include a leader sequence enabling extracellular secretion of
translated protein by a host cell. Alternatively, where recombinant protein is expressed
without a leader or transport sequence, it may include an amino terminal methionine
residue. This residue may or may not be subsequently cleaved from the expressed
recombinant protein to provide a final product.

The term "recombinant expression system" means host cells which have stably
integrated a recombinant transcriptional unit into chromosomal DNA or carry the
recombinant transcriptional unit extrachromosomally. Recombinant expression systems
as defined herein will express heterologous polypeptides or proteins upon induction of
the regulatory elements linked to the DNA segment or synthetic gene to be expressed.
This term also means host cells which have stably integrated a recombinant genetic
element or elements having a regulatory role in gene expression, for example, promoters
or enhancers. Recombinant expression systems as defined herein will express
polypeptides or proteins endogenous to the cell upon induction of the regulatory elements
linked to the endogenous DNA segment or gene to be expressed. The cells can be
prokaryotic or eukaryotic.

The term "secreted" includes a protein that is transported across or through a
membrane, including transport as a result of signal sequences in its amino acid sequence
when it is expressed in a suitable host cell. "Secreted" proteins include without limitation
proteins secreted wholly (e.g., soluble proteins) or partially (e.g., receptors) from the cell
in which they are expressed. "Secreted" proteins also include without limitation proteins
that are transported across the membrane of the endoplasmic reticulum. "Secreted"
proteins are also intended to include proteins containing non-typical signal sequences
(e.g., Interleukin-1 Beta, see Krasney, P.A. and Young, P.R. (1992) Cytokine 4(2):134-143)
and factors released from damaged cells (e.g., Interleukin-1 Receptor Antagonist,
see Arend, W.P. et al. (1998) Annu. Rev. Immunol. 16:27-55).

Where desired, an expression vector may be designed to contain a "signal or
leader sequence" which will direct the polypeptide through the membrane of a cell. Such
a sequence may be naturally present on the polypeptides of the present invention or
provided from heterologous protein sources by recombinant DNA techniques.

The term "stringent" is used to refer to conditions that are commonly understood
in the art as stringent. Stringent conditions can include highly stringent conditions (i.e.,
hybridization to filter-bound DNA in 0.5 M NaHPO4, 7% sodium dodecyl sulfate (SDS),
1 mM EDTA at 65°C, and washing in 0.1X SSC/0.1% SDS at 68°C), and moderately
stringent conditions (i.e., washing in 0.2X SSC/0.1% SDS at 42°C). Other exemplary
hybridization conditions are described herein in the examples.

In instances of hybridization of deoxyoligonucleotides, additional exemplary
stringent hybridization conditions include washing in 6X SSC/0.05% sodium
pyrophosphate at 37°C (for 14-base oligonucleotides), 48°C (for 17-base oligonucleotides),
55°C (for 20-base oligonucleotides), and 60°C (for 23-base oligonucleotides).
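
By way of illustration only, the exemplary oligonucleotide wash temperatures recited above can be held in a simple lookup. The sketch below is in Python; the names OLIGO_WASH_TEMP_C and wash_temperature_c are hypothetical and chosen for this example, and no temperature is interpolated for lengths that are not recited above.

```python
# Exemplary wash temperatures (degrees C) in 6X SSC/0.05% sodium pyrophosphate,
# keyed by oligonucleotide length in bases, as recited in the text above.
OLIGO_WASH_TEMP_C = {14: 37, 17: 48, 20: 55, 23: 60}


def wash_temperature_c(oligo: str) -> int:
    """Return the recited wash temperature for an oligonucleotide of a listed length."""
    length = len(oligo)
    if length not in OLIGO_WASH_TEMP_C:
        # Lengths other than 14, 17, 20 and 23 are not recited; do not guess.
        raise ValueError(f"no exemplary wash temperature recited for {length}-base oligos")
    return OLIGO_WASH_TEMP_C[length]
```

For lengths not listed, suitable conditions would need to be determined empirically or by methods known in the art.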

As used herein, "substantially equivalent" or "substantially similar" can refer both
to nucleotide and amino acid sequences, for example a mutant sequence, that varies from
a reference sequence by one or more substitutions, deletions, or additions, the net effect
of which does not result in an adverse functional dissimilarity between the reference and
subject sequences. Typically, such a substantially equivalent sequence varies from one of
those listed herein by no more than about 35% (i.e., the number of individual residue
substitutions, additions, and/or deletions in a substantially equivalent sequence, as
compared to the corresponding reference sequence, divided by the total number of
residues in the substantially equivalent sequence is about 0.35 or less). Such a sequence
is said to have 65% sequence identity to the listed sequence. In one embodiment, a
substantially equivalent, e.g., mutant, sequence of the invention varies from a listed
sequence by no more than 30% (70% sequence identity); in a variation of this
embodiment, by no more than 25% (75% sequence identity); and in a further variation of
this embodiment, by no more than 20% (80% sequence identity) and in a further variation
of this embodiment, by no more than 10% (90% sequence identity) and in a further
variation of this embodiment, by no more than 5% (95% sequence identity). Substantially
equivalent, e.g., mutant, amino acid sequences according to the invention preferably have
at least 80% sequence identity with a listed amino acid sequence, more preferably at least
85% sequence identity, more preferably at least 90% sequence identity, more preferably
at least 95% sequence identity, more preferably at least 98% sequence identity, and most
preferably at least 99% sequence identity. Substantially equivalent nucleotide sequences
of the invention can have lower percent sequence identities, taking into account, for
example, the redundancy or degeneracy of the genetic code. Preferably, the nucleotide
sequence has at least about 65% identity, more preferably at least about 75% identity,
more preferably at least about 80% sequence identity, more preferably at least 85%
sequence identity, more preferably at least 90% sequence identity, more preferably at
least about 95% sequence identity, more preferably at least 98% sequence identity, and
most preferably at least 99% sequence identity. For the purposes of the present
invention, sequences having substantially equivalent biological activity and substantially
equivalent expression characteristics are considered substantially equivalent. For the
purposes of determining equivalence, truncation of the mature sequence (e.g., via a
mutation which creates a spurious stop codon) should be disregarded. Sequence identity
may be determined, e.g., using the Jotun Hein method (Hein, J. (1990) Methods
Enzymol. 183:626-645). Identity between sequences can also be determined by other
methods known in the art, e.g., by varying hybridization conditions.
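
The numerical criterion above (residue differences divided by the total number of residues in the substantially equivalent sequence) can be illustrated with a short calculation. The following Python sketch assumes that the two sequences have already been aligned by some method (for example the Jotun Hein method) and are supplied as equal-length strings in which "-" marks a gap. The function names and the gap convention are assumptions made for this example. The sketch captures only the numerical test, not the functional-equivalence considerations discussed above.

```python
def fraction_varied(reference_aln: str, subject_aln: str) -> float:
    """Differences (substitutions, additions, deletions) divided by the
    number of residues in the subject (substantially equivalent) sequence."""
    if len(reference_aln) != len(subject_aln):
        raise ValueError("aligned sequences must have equal length")
    differences = 0
    subject_residues = 0
    for ref, sub in zip(reference_aln, subject_aln):
        if sub != "-":
            subject_residues += 1  # count residues actually present in the subject
        if ref != sub:
            differences += 1  # a substitution, an addition, or a deletion
    if subject_residues == 0:
        raise ValueError("subject sequence contains no residues")
    return differences / subject_residues


def is_substantially_equivalent(reference_aln: str, subject_aln: str,
                                max_variation: float = 0.35) -> bool:
    """Apply the numerical threshold only (default: about 35% variation)."""
    return fraction_varied(reference_aln, subject_aln) <= max_variation
```

For instance, a ten-residue subject sequence carrying a single substitution varies by 0.1 from its reference, corresponding to 90% sequence identity under the definition above.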

The term "totipotent" refers to the capability of a cell to differentiate into all of
the cell types of an adult organism.

The term "transformation" means introducing DNA into a suitable host cell so
that the DNA is replicable, either as an extrachromosomal element, or by chromosomal
integration. The term "transfection" refers to the taking up of an expression vector by a
suitable host cell, whether or not any coding sequences are in fact expressed. The term
"infection" refers to the introduction of nucleic acids into a suitable host cell by use of a
virus or viral vector.

As used herein, an "uptake modulating fragment," UMF, means a series of
nucleotides which mediate the uptake of a linked DNA fragment into a cell. UMFs can
be readily identified using known UMFs as a target sequence or target motif with the
computer-based systems described below. The presence and activity of a UMF can be
confirmed by attaching the suspected UMF to a marker sequence. The resulting nucleic
acid molecule is then incubated with an appropriate host under appropriate conditions and
the uptake of the marker sequence is determined. As described above, a UMF will
increase the frequency of uptake of a linked marker sequence.

Each of the above terms is meant to encompass all that is described for each,
unless the context dictates otherwise.

4.2 NUCLEIC ACIDS OF THE INVENTION 

Nucleotide sequences of the invention are set forth in the Sequence listing. 

The isolated polynucleotides of the invention include a polynucleotide comprising
the nucleotide sequences of SEQ ID NO: 1 - 438; a polynucleotide encoding any one of
the peptide sequences of SEQ ID NO: 1 - 438; and a polynucleotide comprising the
nucleotide sequence encoding the mature protein coding sequence of the polynucleotides
of any one of SEQ ID NO: 1 - 438. The polynucleotides of the present invention also
include, but are not limited to, a polynucleotide that hybridizes under stringent conditions
to (a) the complement of any of the nucleotide sequences of SEQ ID NO: 1 - 438; (b)
nucleotide sequences encoding any one of the amino acid sequences set forth in the
Sequence listing; (c) a polynucleotide which is an allelic variant of any polynucleotide
recited above; (d) a polynucleotide which encodes a species homolog of any of the
proteins recited above; or (e) a polynucleotide that encodes a polypeptide comprising a
specific domain or truncation of the polypeptides of SEQ ID NO: 1 - 438. Domains of
interest may depend on the nature of the encoded polypeptide; e.g., domains in
receptor-like polypeptides include ligand-binding, extracellular, transmembrane, or
cytoplasmic domains, or combinations thereof; domains in immunoglobulin-like proteins
include the variable immunoglobulin-like domains; domains in enzyme-like polypeptides
include catalytic and substrate binding domains; and domains in ligand polypeptides include
receptor-binding domains.

The polynucleotides of the invention include naturally occurring or wholly or
partially synthetic DNA, e.g., cDNA and genomic DNA, and RNA, e.g., mRNA. The
polynucleotides may include all of the coding region of the cDNA or may represent a
portion of the coding region of the cDNA.

The present invention also provides genes corresponding to the cDNA sequences
disclosed herein. The corresponding genes can be isolated in accordance with known
methods using the sequence information disclosed herein. Such methods include the
preparation of probes or primers from the disclosed sequence information for identification
and/or amplification of genes in appropriate genomic libraries or other sources of genomic
materials. Further 5' and 3' sequence can be obtained using methods known in the art. For
example, full length cDNA or genomic DNA that corresponds to any of the polynucleotides
of SEQ ID NO: 1 - 438 can be obtained by screening appropriate cDNA or genomic DNA
libraries under suitable hybridization conditions using any of the polynucleotides of SEQ ID
NO: 1 - 438 or a portion thereof as a probe. Alternatively, the polynucleotides of SEQ ID
NO: 1 - 438 may be used as the basis for suitable primer(s) that allow identification and/or
amplification of genes in appropriate genomic DNA or cDNA libraries.

The nucleic acid sequences of the invention can be assembled from ESTs and
sequences (including cDNA and genomic sequences) obtained from one or more public
databases, such as dbEST, gbpri, and UniGene. The EST sequences can provide identifying
sequence information, representative fragment or segment information, or novel segment
information for the full-length gene.

The invention also provides polynucleotides including
nucleotide sequences that are substantially equivalent to the polynucleotides recited
above. Polynucleotides according to the invention can have, e.g., at least about 65%, at
least about 70%, at least about 75%, at least about 80%, 81%, 82%, 83%, 84%, more
typically at least about 85%, 86%, 87%, 88%, 89%, more typically at least about 90%,
91%, 92%, 93%, 94%, and even more typically at least about 95%, 96%, 97%, 98%, 99%
sequence identity to a polynucleotide recited above.

Included within the scope of the nucleic acid sequences of the invention are
nucleic acid sequence fragments that hybridize under stringent conditions to any of the
nucleotide sequences of SEQ ID NO: 1 - 438, or complements thereof, which fragment is
greater than about 5 nucleotides, preferably 7 nucleotides, more preferably greater than 9
nucleotides and most preferably greater than 17 nucleotides. Fragments of, e.g., 15, 17, or
20 nucleotides or more that are selective for (i.e., specifically hybridize to) any one of the
polynucleotides of the invention are contemplated. Probes capable of specifically
hybridizing to a polynucleotide can differentiate polynucleotide sequences of the
invention from other polynucleotide sequences in the same family of genes or can
differentiate human genes from genes of other species, and are preferably based on
unique nucleotide sequences.
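
As a rough illustration of how candidate probe fragments based on unique nucleotide sequences might be shortlisted, the Python sketch below lists fragments of a chosen length that occur in a target sequence but not in a set of related sequences. Exact string matching is used here only as a crude stand-in for selective hybridization and is an assumption of this example. The function name unique_fragments is hypothetical, and reverse complements and near matches are not considered.

```python
from typing import Iterable, List


def unique_fragments(target: str, related: Iterable[str], length: int = 17) -> List[str]:
    """Return fragments of `target` (in order of first occurrence) that do not
    appear verbatim in any of the `related` sequences."""
    related = [seq.upper() for seq in related]
    target = target.upper()
    seen = set()
    candidates = []
    for start in range(len(target) - length + 1):
        fragment = target[start:start + length]
        if fragment in seen:
            continue  # report each distinct fragment once
        seen.add(fragment)
        if not any(fragment in seq for seq in related):
            candidates.append(fragment)
    return candidates
```

Any fragment shortlisted this way would still need to be tested under the stringent hybridization conditions defined above.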

The sequences falling within the scope of the present invention are not limited to
these specific sequences, but also include allelic and species variations thereof. Allelic and
species variations can be routinely determined by comparing the sequence provided in SEQ
ID NO: 1 - 438, a representative fragment thereof, or a nucleotide sequence at least 90%
identical, preferably 95% identical, to SEQ ID NOs: 1 - 438 with a sequence from another
isolate of the same species. Furthermore, to accommodate codon variability, the invention
includes nucleic acid molecules coding for the same amino acid sequences as do the specific
ORFs disclosed herein. In other words, in the coding region of an ORF, substitution of one
codon for another codon that encodes the same amino acid is expressly contemplated.
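
The codon substitution contemplated above can be illustrated with the standard genetic code. The Python sketch below builds the standard codon table, lists the codons synonymous with a given codon, and replaces one codon of an ORF only when the encoded amino acid is unchanged. The function names and the example ORF in the usage note are hypothetical, and the standard code is assumed (organisms or organelles using variant codes would need a different table).

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(orf: str) -> str:
    """Translate an ORF (DNA, reading frame 0); '*' denotes a stop codon."""
    usable = len(orf) - len(orf) % 3
    return "".join(CODON_TABLE[orf[i:i + 3]] for i in range(0, usable, 3))


def synonymous_codons(codon: str) -> list:
    """All codons (sorted) encoding the same amino acid as `codon`."""
    amino_acid = CODON_TABLE[codon]
    return sorted(c for c, aa in CODON_TABLE.items() if aa == amino_acid)


def substitute_codon(orf: str, codon_index: int, new_codon: str) -> str:
    """Replace the codon at `codon_index` with a synonymous codon."""
    start = 3 * codon_index
    old_codon = orf[start:start + 3]
    if CODON_TABLE[old_codon] != CODON_TABLE[new_codon]:
        raise ValueError("replacement codon encodes a different amino acid")
    return orf[:start] + new_codon + orf[start + 3:]
```

For example, replacing the CTG codon of the illustrative ORF ATGCTGAAATAA with TTA changes the nucleotide sequence but leaves the encoded Met-Leu-Lys sequence unchanged.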

The nearest neighbor or homology result for the nucleic acids of the present
invention, including SEQ ID NOs: 1 - 438, can be obtained by searching a database using an
algorithm or a program. Preferably, BLAST (Basic Local Alignment Search Tool) is used
to search for local sequence alignments (Altschul, S.F., J. Mol. Evol. 36:290-300 (1993)
and Altschul, S.F. et al., J. Mol. Biol. 215:403-410 (1990)). Alternatively, a FASTA
version 3 search against Genpept, using the Fastxy algorithm, can be used.

Species homologs (or orthologs) of the disclosed polynucleotides and proteins are
also provided by the present invention. Species homologs may be isolated and identified
by making suitable probes or primers from the sequences provided herein and screening a
suitable nucleic acid source from the desired species.

The invention also encompasses allelic variants of the disclosed polynucleotides
or proteins; that is, naturally-occurring alternative forms of the isolated polynucleotide
which also encode proteins which are identical, homologous or related to that encoded by
the polynucleotides.

The nucleic acid sequences of the invention are further directed to sequences
which encode variants of the described nucleic acids. These amino acid sequence
variants may be prepared by methods known in the art by introducing appropriate
nucleotide changes into a native or variant polynucleotide. There are two variables in the
construction of amino acid sequence variants: the location of the mutation and the nature
of the mutation. Nucleic acids encoding the amino acid sequence variants are preferably
constructed by mutating the polynucleotide to encode an amino acid sequence that does
not occur in nature. These nucleic acid alterations can be made at sites that differ in the
nucleic acids from different species (variable positions) or in highly conserved regions
(constant regions). Sites at such locations will typically be modified in series, e.g., by
substituting first with conservative choices (e.g., hydrophobic amino acid to a different
hydrophobic amino acid) and then with more distant choices (e.g., hydrophobic amino
acid to a charged amino acid), and then deletions or insertions may be made at the target
site. Amino acid sequence deletions generally range from about 1 to 30 residues,
preferably about 1 to 10 residues, and are typically contiguous. Amino acid insertions
include amino- and/or carboxyl-terminal fusions ranging in length from one to one
hundred or more residues, as well as intrasequence insertions of single or multiple amino
acid residues. Intrasequence insertions may range generally from about 1 to 10 amino
acid residues, preferably from 1 to 5 residues. Examples of terminal insertions include the
heterologous signal sequences necessary for secretion or for intracellular targeting in
different host cells and sequences such as FLAG or poly-histidine sequences useful for
purifying the expressed protein.

In a preferred method, polynucleotides encoding the novel amino acid sequences
are changed via site-directed mutagenesis. This method uses oligonucleotide sequences
to alter a polynucleotide to encode the desired amino acid variant, as well as sufficient
adjacent nucleotides on both sides of the changed amino acid to form a stable duplex on
either side of the site being changed. In general, the techniques of site-directed
mutagenesis are well known to those of skill in the art and this technique is exemplified
by publications such as Edelman et al., DNA 2:183 (1983). A versatile and efficient
method for producing site-specific changes in a polynucleotide sequence was published
by Zoller and Smith, Nucleic Acids Res. 10:6487-6500 (1982). PCR may also be used to
create amino acid sequence variants of the novel nucleic acids. When small amounts of
template DNA are used as starting material, primers that differ slightly in sequence
from the corresponding region in the template DNA can generate the desired amino acid
variant. PCR amplification results in a population of product DNA fragments that differ
from the polynucleotide template encoding the polypeptide at the position specified by
the primer. The product DNA fragments replace the corresponding region in the plasmid
and this gives a polynucleotide encoding the desired amino acid variant.
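
The structure of a mutagenic oligonucleotide as described above (the altered codon flanked on both sides by template sequence) can be sketched as follows. This Python example is illustrative only. The function name mutagenic_oligo and the default of 12 flanking nucleotides per side are assumptions made for the example, and the flank length actually needed to form a stable duplex depends on the sequence and the conditions used.

```python
def mutagenic_oligo(template: str, codon_index: int, new_codon: str,
                    flank: int = 12) -> str:
    """Build an oligonucleotide carrying `new_codon` in place of the codon at
    `codon_index` (0-based, reading frame 0), with `flank` unchanged template
    nucleotides on each side of the altered codon."""
    if len(new_codon) != 3:
        raise ValueError("new_codon must be exactly three nucleotides")
    start = 3 * codon_index
    end = start + 3
    if start - flank < 0 or end + flank > len(template):
        raise ValueError("template lacks sufficient flanking sequence on both sides")
    upstream = template[start - flank:start]
    downstream = template[end:end + flank]
    return upstream + new_codon + downstream
```

The returned sequence reads in the same sense as the template; whether it or its complement is synthesized depends on which strand is used as the template in the mutagenesis reaction.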

A further technique for generating amino acid variants is the cassette mutagenesis
technique described in Wells et al., Gene 34:315 (1985); and other mutagenesis
techniques well known in the art, such as, for example, the techniques in Sambrook et al.,
supra, and Current Protocols in Molecular Biology, Ausubel et al. Due to the inherent
degeneracy of the genetic code, other DNA sequences which encode substantially the
same or a functionally equivalent amino acid sequence may be used in the practice of the
invention for the cloning and expression of these novel nucleic acids. Such DNA
sequences include those which are capable of hybridizing to the appropriate novel nucleic
acid sequence under stringent conditions.

Polynucleotides encoding preferred polypeptide truncations of the invention can 
be used to generate polynucleotides encoding chimeric or fusion proteins comprising one 
or more domains of the invention and heterologous protein sequences. 

The polynucleotides of the invention additionally include the complement of any
of the polynucleotides recited above. The polynucleotide can be DNA (genomic, cDNA,
amplified, or synthetic) or RNA. Methods and algorithms for obtaining such
polynucleotides are well known to those of skill in the art and can include, for example,
methods for determining hybridization conditions that can routinely isolate
polynucleotides of the desired sequence identities.

In accordance with the invention, polynucleotide sequences comprising the
mature protein coding sequences corresponding to any one of SEQ ID NO: 1 - 438, or
functional equivalents thereof, may be used to generate recombinant DNA molecules that
direct the expression of that nucleic acid, or a functional equivalent thereof, in
appropriate host cells. Also included are the cDNA inserts of any of the clones identified
herein.

A polynucleotide according to the invention can be joined to any of a variety of
other nucleotide sequences by well-established recombinant DNA techniques (see
Sambrook J et al. (1989) Molecular Cloning: A Laboratory Manual, Cold Spring Harbor
Laboratory, NY). Useful nucleotide sequences for joining to polynucleotides include an
assortment of vectors, e.g., plasmids, cosmids, lambda phage derivatives, phagemids, and
the like, that are well known in the art. Accordingly, the invention also provides a vector
including a polynucleotide of the invention and a host cell containing the polynucleotide.
In general, the vector contains an origin of replication functional in at least one organism,
convenient restriction endonuclease sites, and a selectable marker for the host cell.
Vectors according to the invention include expression vectors, replication vectors, probe
generation vectors, and sequencing vectors. A host cell according to the invention can be
a prokaryotic or eukaryotic cell and can be a unicellular organism or part of a
multicellular organism.

The present invention further provides recombinant constructs comprising a
nucleic acid having any of the nucleotide sequences of SEQ ID NOs: 1 - 438 or a
fragment thereof or any other polynucleotides of the invention. In one embodiment, the
recombinant constructs of the present invention comprise a vector, such as a plasmid or
viral vector, into which a nucleic acid having any of the nucleotide sequences of SEQ ID
NOs: 1 - 438 or a fragment thereof is inserted, in a forward or reverse orientation. In the
case of a vector comprising one of the ORFs of the present invention, the vector may
further comprise regulatory sequences, including for example, a promoter, operably
linked to the ORF. Large numbers of suitable vectors and promoters are known to those
of skill in the art and are commercially available for generating the recombinant
constructs of the present invention. The following vectors are provided by way of
example. Bacterial: pBs, phagescript, PsiX174, pBluescript SK, pBs KS, pNH8a,
pNH16a, pNH18a, pNH46a (Stratagene); pTrc99A, pKK223-3, pKK233-3, pDR540,
pRIT5 (Pharmacia). Eukaryotic: pWLneo, pSV2cat, pOG44, pXT1, pSG (Stratagene);
pSVK3, pBPV, pMSG, pSVL (Pharmacia).

The isolated polynucleotide of the invention may be operably linked to an
expression control sequence such as the pMT2 or pED expression vectors disclosed in
Kaufman et al., Nucleic Acids Res. 19, 4485-4490 (1991), in order to produce the protein
recombinantly. Many suitable expression control sequences are known in the art.
General methods of expressing recombinant proteins are also known and are exemplified
in R. Kaufman, Methods in Enzymology 185, 537-566 (1990). As defined herein
"operably linked" means that the isolated polynucleotide of the invention and an
expression control sequence are situated within a vector or cell in such a way that the
protein is expressed by a host cell which has been transformed (transfected) with the
ligated polynucleotide/expression control sequence.

Promoter regions can be selected from any desired gene using CAT
(chloramphenicol transferase) vectors or other vectors with selectable markers. Two
appropriate vectors are pKK232-8 and pCM7. Particular named bacterial promoters
include lacI, lacZ, T3, T7, gpt, lambda PR, and trc. Eukaryotic promoters include CMV
immediate early, HSV thymidine kinase, early and late SV40, LTRs from retrovirus, and
mouse metallothionein-I. Selection of the appropriate vector and promoter is well within
the level of ordinary skill in the art. Generally, recombinant expression vectors will
include origins of replication and selectable markers permitting transformation of the host
cell, e.g., the ampicillin resistance gene of E. coli and S. cerevisiae TRP1 gene, and a
promoter derived from a highly-expressed gene to direct transcription of a downstream
structural sequence. Such promoters can be derived from operons encoding glycolytic
enzymes such as 3-phosphoglycerate kinase (PGK), α-factor, acid phosphatase, or heat
shock proteins, among others. The heterologous structural sequence is assembled in
appropriate phase with translation initiation and termination sequences, and preferably, a
leader sequence capable of directing secretion of translated protein into the periplasmic
space or extracellular medium. Optionally, the heterologous sequence can encode a
fusion protein including an amino terminal identification peptide imparting desired
characteristics, e.g., stabilization or simplified purification of expressed recombinant
product. Useful expression vectors for bacterial use are constructed by inserting a
structural DNA sequence encoding a desired protein together with suitable translation
initiation and termination signals in operable reading phase with a functional promoter.
The vector will comprise one or more phenotypic selectable markers and an origin of
replication to ensure maintenance of the vector and to, if desirable, provide amplification
within the host. Suitable prokaryotic hosts for transformation include E. coli, Bacillus
subtilis, Salmonella typhimurium and various species within the genera Pseudomonas,
Streptomyces, and Staphylococcus, although others may also be employed as a matter of
choice.

As a representative but non-limiting example, useful expression vectors for
bacterial use can comprise a selectable marker and bacterial origin of replication derived
from commercially available plasmids comprising genetic elements of the well known
cloning vector pBR322 (ATCC 37017). Such commercial vectors include, for example,
pKK223-3 (Pharmacia Fine Chemicals, Uppsala, Sweden) and GEM 1 (Promega
Biotech, Madison, WI, USA). These pBR322 "backbone" sections are combined with an
appropriate promoter and the structural sequence to be expressed. Following
transformation of a suitable host strain and growth of the host strain to an appropriate cell
density, the selected promoter is induced or derepressed by appropriate means (e.g.,
temperature shift or chemical induction) and cells are cultured for an additional period.
Cells are typically harvested by centrifugation, disrupted by physical or chemical means,
and the resulting crude extract retained for further purification.

Polynucleotides of the invention can also be used to induce immune responses.
For example, as described in Fan et al., Nat. Biotech. 17:870-872 (1999), incorporated
herein by reference, nucleic acid sequences encoding a polypeptide may be used to
generate antibodies against the encoded polypeptide following topical administration of
naked plasmid DNA or following injection, and preferably intra-muscular injection of the
DNA. The nucleic acid sequences are preferably inserted in a recombinant expression
vector and may be in the form of naked DNA.

4.3 ANTISENSE 

Another aspect of the invention pertains to isolated antisense nucleic acid
molecules that are hybridizable to or complementary to the nucleic acid molecule
comprising the nucleotide sequence of SEQ ID NO: 1 - 438, or fragments, analogs or
derivatives thereof. An "antisense" nucleic acid comprises a nucleotide sequence that is
complementary to a "sense" nucleic acid encoding a protein, e.g., complementary to the
coding strand of a double-stranded cDNA molecule or complementary to an mRNA
sequence. In specific aspects, antisense nucleic acid molecules are provided that
comprise a sequence complementary to at least about 10, 25, 50, 100, 250 or 500
nucleotides or an entire coding strand, or to only a portion thereof. Nucleic acid
molecules encoding fragments, homologs, derivatives and analogs of a protein of any of
SEQ ID NO: 1 - 438 or antisense nucleic acids complementary to a nucleic acid sequence
of SEQ ID NO: 1 - 438 are additionally provided.

In one embodiment, an antisense nucleic acid molecule is antisense to a "coding
region" of the coding strand of a nucleotide sequence of the invention. The term "coding
region" refers to the region of the nucleotide sequence comprising codons which are
translated into amino acid residues. In another embodiment, the antisense nucleic acid
molecule is antisense to a "noncoding region" of the coding strand of a nucleotide
sequence of the invention. The term "noncoding region" refers to 5' and 3' sequences that
flank the coding region that are not translated into amino acids (i.e., also referred to as 5'
and 3' untranslated regions).

Given the coding strand sequences encoding a nucleic acid disclosed herein (e.g. , 

15 SEQ ID NO: 1 - 438, antisense nucleic acids of the invention can be designed according 
to the rules of Watson and Crick or Hoogsteen base pairing. The antisense nucleic acid 
molecule can be complementary to the entire coding region of an mRNA, but more 
preferably is an oligonucleotide that is antisense to only a portion of the coding or 
noncoding region of an mRNA. For example, the antisense oligonucleotide can be 

20 complementary to the region surrounding the translation start site of an mRNA. An 
antisense oligonucleotide can be, for example, about 5, 10, 15, 20, 25, 30, 35, 40, 45 or 
50 nucleotides in length. An antisense nucleic acid of the invention can be constructed 
using chemical synthesis or enzymatic ligation reactions using procedures known in the 
art. For example, an antisense nucleic acid (e.g., an antisense oligonucleotide) can be 

25 chemically synthesized using naturally occurring nucleotides or variously modified 
nucleotides designed to increase the biological stability of the molecules or to increase 
the physical stability of the duplex formed between the antisense and sense nucleic acids, 
e.g., phosphorothioate derivatives and acridine substituted nucleotides can be used. 
Examples of modified nucleotides that can be used to generate the antisense 

nucleic acid include: 5-fluorouracil, 5-bromouracil, 5-chlorouracil, 5-iodouracil,
hypoxanthine, xanthine, 4-acetylcytosine, 5-(carboxyhydroxylmethyl) uracil,
5-carboxymethylaminomethyl-2-thiouridine, 5-carboxymethylaminomethyluracil,
dihydrouracil, beta-D-galactosylqueosine, inosine, N6-isopentenyladenine,
1-methylguanine, 1-methylinosine, 2,2-dimethylguanine, 2-methyladenine,
2-methylguanine, 3-methylcytosine, 5-methylcytosine, N6-adenine, 7-methylguanine,
5-methylaminomethyluracil, 5-methoxyaminomethyl-2-thiouracil,
beta-D-mannosylqueosine, 5-methoxycarboxymethyluracil, 5-methoxyuracil,
2-methylthio-N6-isopentenyladenine, uracil-5-oxyacetic acid (v), wybutoxosine,
pseudouracil, queosine, 2-thiocytosine, 5-methyl-2-thiouracil, 2-thiouracil, 4-thiouracil,
5-methyluracil, uracil-5-oxyacetic acid methylester, uracil-5-oxyacetic acid (v),
5-methyl-2-thiouracil, 3-(3-amino-3-N-2-carboxypropyl) uracil, (acp3)w, and
2,6-diaminopurine. Alternatively, the antisense nucleic acid can be produced biologically
using an expression vector into which a nucleic acid has been subcloned in an antisense
orientation (i.e., RNA transcribed from the inserted nucleic acid will be of an antisense
orientation to a target nucleic acid of interest, described further in the following
subsection).
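
By way of illustration only, the Watson and Crick design rule described above can be sketched in a few lines of Python. The sketch assumes the coding strand is written as DNA (T rather than U), and the function names, the 20-nucleotide default length and the example sequence are hypothetical; the example sequence is not taken from the sequence listing.

```python
# Watson-Crick pairing for DNA bases (assumption: the oligo is written as DNA).
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence, read 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def antisense_oligo(coding_strand: str, start_codon_index: int, length: int = 20) -> str:
    """Return an oligo antisense to the region around the translation start site.

    coding_strand: sense-strand sequence in cDNA form.
    start_codon_index: 0-based position of the A of the ATG start codon.
    length: desired oligo length in nucleotides.
    """
    half = length // 2
    begin = max(0, start_codon_index - half)
    end = min(len(coding_strand), begin + length)
    return reverse_complement(coding_strand[begin:end])


# Hypothetical 30-nt sense sequence used only to exercise the functions.
sense = "GGCACGAGCCATGGCTTCTAAGGTCCTGAA"
oligo = antisense_oligo(sense, sense.index("ATG"), length=20)
```

Taking the reverse complement of the returned oligo recovers the targeted stretch of the coding strand, which is a convenient sanity check. Hoogsteen pairing and modified nucleotides are outside the scope of this sketch.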

The antisense nucleic acid molecules of the invention are typically administered 
to a subject or generated in situ such that they hybridize with or bind to cellular mRNA 
and/or genomic DNA encoding a protein according to the invention to thereby inhibit 
expression of the protein, e.g., by inhibiting transcription and/or translation. The 

hybridization can be by conventional nucleotide complementarity to form a stable duplex,
or, for example, in the case of an antisense nucleic acid molecule that binds to DNA 
duplexes, through specific interactions in the major groove of the double helix. An 
example of a route of administration of antisense nucleic acid molecules of the invention 
includes direct injection at a tissue site. Alternatively, antisense nucleic acid molecules 

can be modified to target selected cells and then administered systemically. For example,
for systemic administration, antisense molecules can be modified such that they 
specifically bind to receptors or antigens expressed on a selected cell surface, e.g., by 
linking the antisense nucleic acid molecules to peptides or antibodies that bind to cell 
surface receptors or antigens. The antisense nucleic acid molecules can also be delivered 

to cells using the vectors described herein. To achieve sufficient intracellular
concentrations of antisense molecules, vector constructs in which the antisense nucleic
acid molecule is placed under the control of a strong pol II or pol III promoter are
preferred. 

In yet another embodiment, the antisense nucleic acid molecule of the invention is
an α-anomeric nucleic acid molecule. An α-anomeric nucleic acid molecule forms
specific double-stranded hybrids with complementary RNA in which, contrary to the usual
β-units, the strands run parallel to each other (Gaultier et al. (1987) Nucleic Acids Res 15:
6625-6641). The antisense nucleic acid molecule can also comprise a
2'-O-methylribonucleotide (Inoue et al. (1987) Nucleic Acids Res 15: 6131-6148) or a
chimeric RNA-DNA analogue (Inoue et al. (1987) FEBS Lett 215: 327-330).

4.4 RIBOZYMES AND PNA MOIETIES 

In still another embodiment, an antisense nucleic acid of the invention is a 
ribozyme. Ribozymes are catalytic RNA molecules with ribonuclease activity that are 
capable of cleaving a single-stranded nucleic acid, such as an mRNA, to which they have 

a complementary region. Thus, ribozymes (e.g., hammerhead ribozymes (described in
Haseloff and Gerlach (1988) Nature 334:585-591)) can be used to catalytically cleave
mRNA transcripts to thereby inhibit translation of an mRNA. A ribozyme having 
specificity for a nucleic acid of the invention can be designed based upon the nucleotide 
sequence of a DNA disclosed herein (i.e., SEQ ID NO: 1 - 438). For example, a
derivative of Tetrahymena L-19 IVS RNA can be constructed in which the nucleotide
sequence of the active site is complementary to the nucleotide sequence to be cleaved in a
SECX-encoding mRNA. See, e.g., Cech et al. U.S. Pat. No. 4,987,071; and Cech et al.
U.S. Pat. No. 5,116,742. Alternatively, SECX mRNA can be used to select a catalytic
RNA having a specific ribonuclease activity from a pool of RNA molecules. See, e.g.,
Bartel et al. (1993) Science 261:1411-1418.

Alternatively, gene expression can be inhibited by targeting nucleotide sequences 
complementary to the regulatory region (e.g., promoter and/or enhancers) to form triple 
helical structures that prevent transcription of the gene in target cells. See generally, 
Helene (1991) Anticancer Drug Des. 6: 569-84; Helene et al. (1992) Ann. N.Y. Acad.
Sci. 660:27-36; and Maher (1992) Bioessays 14: 807-15.

In various embodiments, the nucleic acids of the invention can be modified at the
base moiety, sugar moiety or phosphate backbone to improve, e.g., the stability, 
hybridization, or solubility of the molecule. For example, the deoxyribose phosphate 
backbone of the nucleic acids can be modified to generate peptide nucleic acids (see 
Hyrup et al. (1996) Bioorg Med Chem 4: 5-23). As used herein, the terms "peptide
nucleic acids" or "PNAs" refer to nucleic acid mimics, e.g., DNA mimics, in which the 
deoxyribose phosphate backbone is replaced by a pseudopeptide backbone and only the 
four natural nucleobases are retained. The neutral backbone of PNAs has been shown to 
allow for specific hybridization to DNA and RNA under conditions of low ionic strength. 

The synthesis of PNA oligomers can be performed using standard solid phase peptide
synthesis protocols as described in Hyrup et al (1996) above; Perry-O'Keefe et al. (1996) 
PNAS 93: 14670-675. 

PNAs of the invention can be used in therapeutic and diagnostic applications. For 
example, PNAs can be used as antisense or antigene agents for sequence-specific 

modulation of gene expression by, e.g., inducing transcription or translation arrest or
inhibiting replication. PNAs of the invention can also be used, e.g., in the analysis of 
single base pair mutations in a gene by, e.g., PNA directed PCR clamping; as artificial 
restriction enzymes when used in combination with other enzymes, e.g., S1 nucleases
(Hyrup B. (1996) above); or as probes or primers for DNA sequence and hybridization
(Hyrup et al. (1996), above; Perry-O'Keefe (1996), above).

In another embodiment, PNAs of the invention can be modified, e.g., to enhance 
their stability or cellular uptake, by attaching lipophilic or other helper groups to PNA, by 
the formation of PNA-DNA chimeras, or by the use of liposomes or other techniques of 
drug delivery known in the art. For example, PNA-DNA chimeras can be generated that 

may combine the advantageous properties of PNA and DNA. Such chimeras allow DNA
recognition enzymes, e.g., RNase H and DNA polymerases, to interact with the DNA 
portion while the PNA portion would provide high binding affinity and specificity. 
PNA-DNA chimeras can be linked using linkers of appropriate lengths selected in terms 
of base stacking, number of bonds between the nucleobases, and orientation (Hyrup 

(1996) above). The synthesis of PNA-DNA chimeras can be performed as described in
Hyrup (1996) above and Finn et al. (1996) Nucl Acids Res 24: 3357-63. For example, a
DNA chain can be synthesized on a solid support using standard phosphoramidite
coupling chemistry, and modified nucleoside analogs, e.g.,
5'-(4-methoxytrityl)amino-5'-deoxy-thymidine phosphoramidite, can be used between the
PNA and the 5' end of DNA (Mag et al. (1989) Nucl Acid Res 17: 5973-88). PNA
monomers are then coupled in a stepwise manner to produce a chimeric molecule with a
5' PNA segment and a 3' DNA segment (Finn et al. (1996) above). Alternatively,
chimeric molecules can be synthesized with a 5' DNA segment and a 3' PNA segment.
See, Petersen et al. (1975) Bioorg Med Chem Lett 5: 1119-11124.

In other embodiments, the oligonucleotide may include other appended groups 

such as peptides (e.g., for targeting host cell receptors in vivo), or agents facilitating
transport across the cell membrane (see, e.g., Letsinger et al., 1989, Proc. Natl. Acad. Sci.
U.S.A. 86:6553-6556; Lemaitre et al., 1987, Proc. Natl. Acad. Sci. 84:648-652; PCT
Publication No. WO88/09810) or the blood-brain barrier (see, e.g., PCT Publication No.
WO89/10134). In addition, oligonucleotides can be modified with hybridization-triggered
cleavage agents (see, e.g., Krol et al., 1988, BioTechniques 6:958-976) or intercalating
agents (see, e.g., Zon, 1988, Pharm. Res. 5: 539-549). To this end, the oligonucleotide
may be conjugated to another molecule, e.g., a peptide, a hybridization triggered 
cross-linking agent, a transport agent, a hybridization-triggered cleavage agent, etc. 

4.5 HOSTS

The present invention further provides host cells genetically engineered to contain 
the polynucleotides of the invention. For example, such host cells may contain nucleic 
acids of the invention introduced into the host cell using known transformation, 
transfection or infection methods. The present invention still further provides host cells 

genetically engineered to express the polynucleotides of the invention, wherein such
polynucleotides are in operative association with a regulatory sequence heterologous to 
the host cell which drives expression of the polynucleotides in the cell. 

Knowledge of nucleic acid sequences allows for modification of cells to permit, 
or increase, expression of endogenous polypeptide. Cells can be modified (e.g., by 

homologous recombination) to provide increased polypeptide expression by replacing, in
whole or in part, the naturally occurring promoter with all or part of a heterologous
promoter so that the cells express the polypeptide at higher levels. The heterologous
promoter is inserted in such a manner that it is operatively linked to the encoding
sequences. See, for example, PCT International Publication No. WO94/12650, PCT
International Publication No. WO92/20808, and PCT International Publication No.
WO91/09955. It is also contemplated that, in addition to heterologous promoter DNA,
amplifiable marker DNA (e.g., ada, dhfr, and the multifunctional CAD gene which 
encodes carbamyl phosphate synthase, aspartate transcarbamylase, and dihydroorotase) 
and/or intron DNA may be inserted along with the heterologous promoter DNA. If 
linked to the coding sequence, amplification of the marker DNA by standard selection 

methods results in co-amplification of the desired protein coding sequences in the cells.

The host cell can be a higher eukaryotic host cell, such as a mammalian cell, a 
lower eukaryotic host cell, such as a yeast cell, or the host cell can be a prokaryotic cell, 
such as a bacterial cell. Introduction of the recombinant construct into the host cell can 
be effected by calcium phosphate transfection, DEAE-dextran mediated transfection, or
electroporation (Davis, L. et al., Basic Methods in Molecular Biology (1986)). The host
cells containing one of the polynucleotides of the invention can be used in conventional
manners to produce the gene product encoded by the isolated fragment (in the case of an
ORF) or can be used to produce a heterologous protein under the control of the EMR.

Any host/vector system can be used to express one or more of the ORFs of the
present invention. These include, but are not limited to, eukaryotic hosts such as HeLa
cells, CV-1 cells, COS cells, 293 cells, and Sf9 cells, as well as prokaryotic hosts such as E.
coli and B. subtilis. The most preferred cells are those which do not normally express the
particular polypeptide or protein or which express the polypeptide or protein at a low
natural level. Mature proteins can be expressed in mammalian cells, yeast, bacteria, or

other cells under the control of appropriate promoters. Cell-free translation systems can
also be employed to produce such proteins using RNAs derived from the DNA constructs 
of the present invention. Appropriate cloning and expression vectors for use with 
prokaryotic and eukaryotic hosts are described by Sambrook, et al., in Molecular 
Cloning: A Laboratory Manual, Second Edition, Cold Spring Harbor, New York (1989), 

the disclosure of which is hereby incorporated by reference.

Various mammalian cell culture systems can also be employed to express
recombinant protein. Examples of mammalian expression systems include the COS-7
lines of monkey kidney fibroblasts, described by Gluzman, Cell 23:175 (1981). Other
cell lines capable of expressing a compatible vector are, for example, C127, monkey
COS cells, Chinese Hamster Ovary (CHO) cells, human kidney 293 cells, human
epidermal A431 cells, human Colo205 cells, 3T3 cells, CV-1 cells, other transformed
primate cell lines, normal diploid cells, cell strains derived from in vitro culture of
primary tissue, primary explants, HeLa cells, mouse L cells, BHK, HL-60, U937, HaK or
Jurkat cells. Mammalian expression vectors will comprise an origin of replication, a
suitable promoter and also any necessary ribosome binding sites, polyadenylation site,
splice donor and acceptor sites, transcriptional termination sequences, and 5' flanking
nontranscribed sequences. DNA sequences derived from the SV40 viral genome, for
example, SV40 origin, early promoter, enhancer, splice, and polyadenylation sites may be
used to provide the required nontranscribed genetic elements. Recombinant polypeptides
and proteins produced in bacterial culture are usually isolated by initial extraction from
cell pellets, followed by one or more salting-out, aqueous ion exchange or size exclusion 
chromatography steps. Protein refolding steps can be used, as necessary, in completing 
configuration of the mature protein. Finally, high performance liquid chromatography 
(HPLC) can be employed for final purification steps. Microbial cells employed in 

expression of proteins can be disrupted by any convenient method, including freeze-thaw
cycling, sonication, mechanical disruption, or use of cell lysing agents. 

Alternatively, it may be possible to produce the protein in lower eukaryotes such 
as yeast or insects or in prokaryotes such as bacteria. Potentially suitable yeast strains 
include Saccharomyces cerevisiae, Schizosaccharomyces pombe, Kluyveromyces strains, 

Candida, or any yeast strain capable of expressing heterologous proteins. Potentially
suitable bacterial strains include Escherichia coli, Bacillus subtilis, Salmonella 
typhimurium, or any bacterial strain capable of expressing heterologous proteins. If the 
protein is made in yeast or bacteria, it may be necessary to modify the protein produced 
therein, for example by phosphorylation or glycosylation of the appropriate sites, in order 

to obtain the functional protein. Such covalent attachments may be accomplished using
known chemical or enzymatic methods.

In another embodiment of the present invention, cells and tissues may be
engineered to express an endogenous gene comprising the polynucleotides of the
invention under the control of inducible regulatory elements, in which case the regulatory
sequences of the endogenous gene may be replaced by homologous recombination. As
described herein, gene targeting can be used to replace a gene's existing regulatory region
with a regulatory sequence isolated from a different gene or a novel regulatory sequence
synthesized by genetic engineering methods. Such regulatory sequences may be
comprised of promoters, enhancers, scaffold-attachment regions, negative regulatory
elements, transcriptional initiation sites, regulatory protein binding sites or combinations
of said sequences. Alternatively, sequences which affect the structure or stability of the
RNA or protein produced may be replaced, removed, added, or otherwise modified by
targeting. These sequences include polyadenylation signals, mRNA stability elements,
splice sites, leader sequences for enhancing or modifying transport or secretion properties 
of the protein, or other sequences which alter or improve the function or stability of 

protein or RNA molecules.

The targeting event may be a simple insertion of the regulatory sequence, placing 
the gene under the control of the new regulatory sequence, e.g., inserting a new promoter 
or enhancer or both upstream of a gene. Alternatively, the targeting event may be a 
simple deletion of a regulatory element, such as the deletion of a tissue-specific negative 

regulatory element. Alternatively, the targeting event may replace an existing element;
for example, a tissue-specific enhancer can be replaced by an enhancer that has broader 
or different cell-type specificity than the naturally occurring elements. Here, the 
naturally occurring sequences are deleted and new sequences are added. In all cases, the 
identification of the targeting event may be facilitated by the use of one or more 

selectable marker genes that are contiguous with the targeting DNA, allowing for the
selection of cells in which the exogenous DNA has integrated into the host cell genome. 
The identification of the targeting event may also be facilitated by the use of one or more 
marker genes exhibiting the property of negative selection, such that the negatively 
selectable marker is linked to the exogenous DNA, but configured such that the 

negatively selectable marker flanks the targeting sequence, and such that a correct
homologous recombination event with sequences in the host cell genome does not result
in the stable integration of the negatively selectable marker. Markers useful for this
purpose include the Herpes Simplex Virus thymidine kinase (TK) gene or the bacterial
xanthine-guanine phosphoribosyl-transferase (gpt) gene.

The gene targeting or gene activation techniques which can be used in accordance
with this aspect of the invention are more particularly described in U.S. Patent No.
5,272,071 to Chappel; U.S. Patent No. 5,578,461 to Sherwin et al.; International 
Application No. PCT/US92/09627 (WO93/09222) by Selden et al.; and International 
Application No. PCT/US90/06436 (WO91/06667) by Skoultchi et al., each of which is 
incorporated by reference herein in its entirety. 

4.6 POLYPEPTIDES OF THE INVENTION 

The isolated polypeptides of the invention include, but are not limited to, a 
polypeptide comprising: the amino acid sequences set forth as any one of SEQ ID NO: 1- 
438 or an amino acid sequence encoded by any one of the nucleotide sequences SEQ ID 

NOs: 1 - 438 or the corresponding full length or mature protein. Polypeptides of the

invention also include polypeptides preferably with biological or immunological activity 
that are encoded by: (a) a polynucleotide having any one of the nucleotide sequences set 
forth in SEQ ID NOs: 1 - 438 or (b) polynucleotides encoding any one of the amino acid 
sequences set forth as SEQ ID NO: 1-438 or (c) polynucleotides that hybridize to the 

complement of the polynucleotides of either (a) or (b) under stringent hybridization
conditions. The invention also provides biologically active or immunologically active 
variants of any of the amino acid sequences set forth as SEQ ID NO: 1-438 or the 
corresponding full length or mature protein; and "substantial equivalents" thereof (e.g., 
with at least about 65%, at least about 70%, at least about 75%, at least about 80%, at 

least about 85%, 86%, 87%, 88%, 89%, at least about 90%, 91%, 92%, 93%, 94%,
typically at least about 95%, 96%, 97%, more typically at least about 98%, or most 
typically at least about 99% amino acid identity) that retain biological activity. 
Polypeptides encoded by allelic variants may have a similar, increased, or decreased 
activity compared to polypeptides comprising SEQ ID NO: 1-438. 
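
As a purely illustrative aid, the percent identity thresholds recited above can be checked with a short Python sketch. It assumes the two sequences have already been aligned (gaps written as "-") and defines identity as identical residue columns divided by the total number of alignment columns; other definitions exist, and the function names and example sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences of equal length.

    Assumption: identity = identical non-gap columns / total alignment columns.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("aligned sequences must not be empty")
    matches = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_a)


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float = 65.0) -> bool:
    """True if the aligned pair reaches the given percent identity threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Hypothetical ten-residue peptides differing at one position.
identity = percent_identity("MKTAYIAKQR", "MKTAYLAKQR")
```

In this example nine of ten columns match, so the pair would satisfy a 90% threshold but not a 95% threshold. Producing the alignment itself is left to the programs discussed in section 4.6.1.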

Fragments of the proteins of the present invention which are capable of exhibiting

biological activity are also encompassed by the present invention. Fragments of the
protein may be in linear form or they may be cyclized using known methods, for
example, as described in H. U. Saragovi, et al., Bio/Technology 10, 773-778 (1992) and
in R. S. McDowell, et al., J. Amer. Chem. Soc. 114, 9245-9253 (1992), both of which are
incorporated herein by reference. Such fragments may be fused to carrier molecules such
as immunoglobulins for many purposes, including increasing the valency of protein
binding sites. 

The present invention also provides both full-length and mature forms (for 
example, without a signal sequence or precursor sequence) of the disclosed proteins. The 
protein coding sequence is identified in the sequence listing by translation of the 

disclosed nucleotide sequences. The mature form of such protein may be obtained by
expression of a full-length polynucleotide in a suitable mammalian cell or other host cell. 
The sequence of the mature form of the protein is also determinable from the amino acid 
sequence of the full-length form. Where proteins of the present invention are membrane 
bound, soluble forms of the proteins are also provided. In such forms, part or all of the 

regions causing the proteins to be membrane bound are deleted so that the proteins are
fully secreted from the cell in which they are expressed. 

Protein compositions of the present invention may further comprise an acceptable 
carrier, such as a hydrophilic, e.g., pharmaceutically acceptable, carrier.

The present invention further provides isolated polypeptides encoded by the 

nucleic acid fragments of the present invention or by degenerate variants of the nucleic
acid fragments of the present invention. By "degenerate variant" is intended nucleotide 
fragments which differ from a nucleic acid fragment of the present invention (e.g., an 
ORF) by nucleotide sequence but, due to the degeneracy of the genetic code, encode an 
identical polypeptide sequence. Preferred nucleic acid fragments of the present invention 

are the ORFs that encode proteins.
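
The notion of a degenerate variant defined above can be illustrated with a minimal Python sketch. It assumes the standard genetic code and in-frame DNA sequences; the function names and the two nine-nucleotide example sequences are hypothetical and are not taken from the sequence listing.

```python
from itertools import product

# Standard genetic code, built in TCAG order ('*' marks a stop codon).
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(orf: str) -> str:
    """Translate an in-frame DNA sequence codon by codon."""
    orf = orf.upper()
    if len(orf) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(CODON_TABLE[orf[i:i + 3]] for i in range(0, len(orf), 3))


def is_degenerate_variant(seq_a: str, seq_b: str) -> bool:
    """True if the sequences differ in nucleotides but encode the same polypeptide."""
    return seq_a.upper() != seq_b.upper() and translate(seq_a) == translate(seq_b)


# Hypothetical sequences: both encode Met-Ala-Ser with different codons.
variant = is_degenerate_variant("ATGGCTTCT", "ATGGCCAGC")
```

Here GCT and GCC both encode alanine, and TCT and AGC both encode serine, so the two sequences differ at the nucleotide level while encoding an identical polypeptide.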

A variety of methodologies known in the art can be utilized to obtain any one of 
the isolated polypeptides or proteins of the present invention. At the simplest level, the 
amino acid sequence can be synthesized using commercially available peptide 
synthesizers. The synthetically-constructed protein sequences, by virtue of sharing 

primary, secondary or tertiary structural and/or conformational characteristics with
proteins may possess biological properties in common therewith, including protein 



WO 02/081731 PCT/US02/01222 



activity. This technique is particularly useful in producing small peptides and fragments 
of larger polypeptides. Fragments are useful, for example, in generating antibodies 
against the native polypeptide. Thus, they may be employed as biologically active or 
immunological substitutes for natural, purified proteins in screening of therapeutic 

compounds and in immunological processes for the development of antibodies.

The polypeptides and proteins of the present invention can alternatively be 
purified from cells which have been altered to express the desired polypeptide or protein. 
As used herein, a cell is said to be altered to express a desired polypeptide or protein 
when the cell, through genetic manipulation, is made to produce a polypeptide or protein 
which it normally does not produce or which the cell normally produces at a lower level.
One skilled in the art can readily adapt procedures for introducing and expressing either 
recombinant or synthetic sequences into eukaryotic or prokaryotic cells in order to 
generate a cell which produces one of the polypeptides or proteins of the present 
invention. 

The invention also relates to methods for producing a polypeptide comprising

growing a culture of host cells of the invention in a suitable culture medium, and 
purifying the protein from the cells or the culture in which the cells are grown. For 
example, the methods of the invention include a process for producing a polypeptide in 
which a host cell containing a suitable expression vector that includes a polynucleotide of 

the invention is cultured under conditions that allow expression of the encoded

polypeptide. The polypeptide can be recovered from the culture, conveniently from the 
culture medium, or from a lysate prepared from the host cells and further purified. 
Preferred embodiments include those in which the protein produced by such process is a 
full length or mature form of the protein. 

In an alternative method, the polypeptide or protein is purified from bacterial

cells which naturally produce the polypeptide or protein. One skilled in the art can 
readily follow known methods for isolating polypeptides and proteins in order to obtain 
one of the isolated polypeptides or proteins of the present invention. These include, but 
are not limited to, immunochromatography, HPLC, size-exclusion chromatography, 

ion-exchange chromatography, and immuno-affinity chromatography. See, e.g., Scopes,
Protein Purification: Principles and Practice, Springer-Verlag (1994); Sambrook, et al.,
in Molecular Cloning: A Laboratory Manual; Ausubel et al., Current Protocols in
Molecular Biology. Polypeptide fragments that retain biological/immunological activity
include fragments comprising greater than about 100 amino acids, or greater than about
200 amino acids, and fragments that encode specific protein domains.

The purified polypeptides can be used in in vitro binding assays which are well

known in the art to identify molecules which bind to the polypeptides. These molecules 
include, but are not limited to, small molecules, molecules from combinatorial
libraries, antibodies or other proteins. The molecules identified in the binding assay are 
then tested for antagonist or agonist activity in in vivo tissue culture or animal models 
that are well known in the art. In brief, the molecules are titrated into a plurality of cell
cultures or animals and then tested for either cell/animal death or prolonged survival of 
the animal/cells. 

In addition, the peptides of the invention or molecules capable of binding to the 
peptides may be complexed with toxins, e.g., ricin or cholera, or with other compounds 
that are toxic to cells. The toxin-binding molecule complex is then targeted to a tumor or
other cell by the specificity of the binding molecule for SEQ ID NO: 1-438. 

The protein of the invention may also be expressed as a product of transgenic 
animals, e.g., as a component of the milk of transgenic cows, goats, pigs, or sheep which 
are characterized by somatic or germ cells containing a nucleotide sequence encoding the 
protein.

The proteins provided herein also include proteins characterized by amino acid 
sequences similar to those of purified proteins but into which modifications are naturally
provided or deliberately engineered. For example, modifications, in the peptide or DNA 
sequence, can be made by those skilled in the art using known techniques. Modifications 

of interest in the protein sequences may include the alteration, substitution, replacement,
insertion or deletion of a selected amino acid residue in the coding sequence. For 
example, one or more of the cysteine residues may be deleted or replaced with another 
amino acid to alter the conformation of the molecule. Techniques for such alteration, 
substitution, replacement, insertion or deletion are well known to those skilled in the art 

(see, e.g., U.S. Pat. No. 4,518,584). Preferably, such alteration, substitution, replacement,
insertion or deletion retains the desired activity of the protein. Regions of the protein that
are important for the protein function can be determined by various methods known in
the art including the alanine-scanning method which involves systematic substitution of
single or strings of amino acids with alanine, followed by testing the resulting
alanine-containing variant for biological activity. This type of analysis determines the
importance of the substituted amino acid(s) in biological activity. Regions of the protein
that are important for protein function may be determined by the eMATRIX program. 
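
The alanine-scanning substitution scheme described above can be sketched, for illustration only, as a routine that enumerates single-residue alanine variants of a protein sequence. The function name and the four-residue example are hypothetical; substitution of strings of residues and the subsequent activity testing are not modeled.

```python
from typing import List, Tuple


def alanine_scan(protein: str) -> List[Tuple[int, str, str]]:
    """Enumerate single-alanine substitution variants of a protein sequence.

    Returns (position, original_residue, variant_sequence) tuples, with
    1-based positions. Residues that are already alanine are skipped,
    since substituting them would not change the sequence.
    """
    variants = []
    for i, residue in enumerate(protein):
        if residue == "A":
            continue
        variant = protein[:i] + "A" + protein[i + 1:]
        variants.append((i + 1, residue, variant))
    return variants


# Hypothetical four-residue peptide used only to exercise the function.
scan = alanine_scan("MKAC")
```

Each variant returned would then be expressed and assayed; a loss of activity for a given variant suggests that the substituted residue contributes to function.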

Other fragments and derivatives of the sequences of proteins which would be 
expected to retain protein activity in whole or in part and are useful for screening or other 
immunological methodologies may also be easily made by those skilled in the art given 

the disclosures herein. Such modifications are encompassed by the present invention.

The protein may also be produced by operably linking the isolated polynucleotide 
of the invention to suitable control sequences in one or more insect expression vectors, 
and employing an insect expression system. Materials and methods for 
baculovirus/insect cell expression systems are commercially available in kit form from, 

e.g., Invitrogen, San Diego, Calif., U.S.A. (the MaxBac™ kit), and such methods are well
known in the art, as described in Summers and Smith, Texas Agricultural Experiment 
Station Bulletin No. 1555 (1987), incorporated herein by reference. As used herein, an 
insect cell capable of expressing a polynucleotide of the present invention is 
"transformed." 

The protein of the invention may be prepared by culturing transformed host cells

under culture conditions suitable to express the recombinant protein. The resulting 
expressed protein may then be purified from such culture (i.e., from culture medium or
cell extracts) using known purification processes, such as gel filtration and ion exchange 
chromatography. The purification of the protein may also include an affinity column 

containing agents which will bind to the protein; one or more column steps over such
affinity resins as concanavalin A-agarose, heparin-toyopearl™ or Cibacron blue 3GA
Sepharose™; one or more steps involving hydrophobic interaction chromatography using 
such resins as phenyl ether, butyl ether, or propyl ether; or immunoaffinity 
chromatography. 

Alternatively, the protein of the invention may also be expressed in a form which
will facilitate purification. For example, it may be expressed as a fusion protein, such as
those of maltose binding protein (MBP), glutathione-S-transferase (GST) or thioredoxin
(TRX), or as a His tag. Kits for expression and purification of such fusion proteins are
commercially available from New England BioLabs (Beverly, Mass.), Pharmacia
(Piscataway, N.J.) and Invitrogen, respectively. The protein can also be tagged with an
epitope and subsequently purified by using a specific antibody directed to such epitope.
One such epitope ("FLAG®") is commercially available from Kodak (New Haven, 
Conn.). 

Finally, one or more reverse-phase high performance liquid chromatography (RP- 
HPLC) steps employing hydrophobic RP-HPLC media, e.g., silica gel having pendant 

methyl or other aliphatic groups, can be employed to further purify the protein. Some or
all of the foregoing purification steps, in various combinations, can also be employed to 
provide a substantially homogeneous isolated recombinant protein. The protein thus 
purified is substantially free of other mammalian proteins and is defined in accordance 
with the present invention as an "isolated protein." 

The polypeptides of the invention include analogs (variants). This embraces
fragments, as well as peptides in which one or more amino acids have been deleted,
inserted, or substituted. Also, analogs of the polypeptides of the invention embrace
fusions of the polypeptides or modifications of the polypeptides of the invention, wherein
the polypeptide or analog is fused to another moiety or moieties, e.g., a targeting moiety or
another therapeutic agent. Such analogs may exhibit improved properties such as activity
and/or stability. Examples of moieties which may be fused to the polypeptide or an
analog include, for example, targeting moieties which provide for the delivery of
polypeptide to pancreatic cells, e.g., antibodies to pancreatic cells, antibodies to immune
cells such as T-cells, monocytes, dendritic cells, granulocytes, etc., as well as receptors
and ligands expressed on pancreatic or immune cells. Other moieties which may be
fused to the polypeptide include therapeutic agents which are used for treatment, for
example, immunosuppressive drugs such as cyclosporin, FK506, azathioprine, CD3
antibodies and steroids. Also, polypeptides may be fused to immune modulators, and
other cytokines such as alpha or beta interferon.

4.6.1 DETERMINING POLYPEPTIDE AND POLYNUCLEOTIDE
IDENTITY AND SIMILARITY 

Preferred methods to determine identity and/or similarity are designed to give the largest
match between the sequences tested. Methods to determine identity and similarity are codified
in computer programs including, but not limited to, the GCG program package,
including GAP (Devereux, J., et al., Nucleic Acids Research 12(1):387 (1984); Genetics
Computer Group, University of Wisconsin, Madison, WI), BLASTP, BLASTN,
BLASTX, FASTA (Altschul, S.F. et al., J. Molec. Biol. 215:403-410 (1990)), PSI-BLAST
(Altschul S.F. et al., Nucleic Acids Res. vol. 25, pp. 3389-3402, herein incorporated by
reference), eMatrix software (Wu et al., J. Comp. Biol., Vol. 6, pp. 219-235 (1999),
herein incorporated by reference), eMotif software (Nevill-Manning et al., ISMB-97, Vol.
4, pp. 202-209, herein incorporated by reference), pFam software (Sonnhammer et al.,
Nucleic Acids Res., Vol. 26(1), pp. 320-322 (1998), herein incorporated by reference)
and the Kyte-Doolittle hydrophobicity prediction algorithm (J. Mol. Biol., 157, pp. 105-31
(1982), incorporated herein by reference). The BLAST programs are publicly available
from the National Center for Biotechnology Information (NCBI) and other sources
(BLAST Manual, Altschul, S., et al. NCBI NLM NIH Bethesda, MD 20894; Altschul, S.,
et al., J. Mol. Biol. 215:403-410 (1990)).
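
By way of illustration only, the notion of aligning two sequences to give the largest match and then reporting percent identity can be sketched in Python as follows. This is a simplified global alignment in the style of Needleman-Wunsch with a linear gap penalty; it is not the GAP, BLAST or FASTA implementation. The function names, the scoring values (match +1, mismatch -1, gap -2) and the choice of alignment length as the denominator are assumptions made for this sketch.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-2):
    """Globally align two sequences (linear gap penalty) and
    return the two aligned strings, with '-' marking gaps."""
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover one best alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if a[i - 1] == b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + sub:
                out_a.append(a[i - 1])
                out_b.append(b[j - 1])
                i, j = i - 1, j - 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(a, b):
    """Identical aligned positions as a percentage of alignment length."""
    aln_a, aln_b = global_align(a, b)
    if not aln_a:
        return 0.0
    same = sum(1 for x, y in zip(aln_a, aln_b) if x == y and x != "-")
    return 100.0 * same / len(aln_a)


if __name__ == "__main__":
    print(global_align("ACGT", "AGT"))
    print(percent_identity("ACGT", "AGT"))
```

Production tools differ from this sketch in their scoring matrices, gap models and identity denominators, so values computed this way are not expected to reproduce those of the programs named above.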

4.7 CHIMERIC AND FUSION PROTEINS 

The invention also provides chimeric or fusion proteins. As used herein, a
"chimeric protein" or "fusion protein" comprises a polypeptide of the invention
operatively linked to another polypeptide. Within a fusion protein the polypeptide
according to the invention can correspond to all or a portion of a protein according to the
invention. In one embodiment, a fusion protein comprises at least one biologically active
portion of a protein according to the invention. In another embodiment, a fusion protein
comprises at least two biologically active portions of a protein according to the invention.
Within the fusion protein, the term "operatively linked" is intended to indicate that the
polypeptide according to the invention and the other polypeptide are fused in-frame to
each other. The polypeptide can be fused to the N-terminus or C-terminus, or to the
middle.

For example, in one embodiment a fusion protein comprises a polypeptide
according to the invention operably linked to the extracellular domain of a second
protein.

In another embodiment, the fusion protein is a GST-fusion protein in which the 
polypeptide sequences of the invention are fused to the C-terminus of the GST (i.e.,
glutathione S-transferase) sequences. 

In another embodiment, the fusion protein is an immunoglobulin fusion protein in
which the polypeptide sequences according to the invention comprise one or more
domains fused to sequences derived from a member of the immunoglobulin protein
family. The immunoglobulin fusion proteins of the invention can be incorporated into
pharmaceutical compositions and administered to a subject to inhibit an interaction
between a ligand and a protein of the invention on the surface of a cell, to thereby
suppress signal transduction in vivo. The immunoglobulin fusion proteins can be used to
affect the bioavailability of a cognate ligand. Inhibition of the ligand/protein interaction
may be useful therapeutically for both the treatment of proliferative and differentiative
disorders, e.g., cancer, as well as modulating (e.g., promoting or inhibiting) cell survival.
Moreover, the immunoglobulin fusion proteins of the invention can be used as
immunogens to produce antibodies in a subject, to purify ligands, and in screening assays
to identify molecules that inhibit the interaction of a polypeptide of the invention with a
ligand.

A chimeric or fusion protein of the invention can be produced by standard
recombinant DNA techniques. For example, DNA fragments coding for the different
polypeptide sequences are ligated together in-frame in accordance with conventional
techniques, e.g., by employing blunt-ended or stagger-ended termini for ligation,
restriction enzyme digestion to provide for appropriate termini, filling-in of cohesive ends
as appropriate, alkaline phosphatase treatment to avoid undesirable joining, and
enzymatic ligation. In another embodiment, the fusion gene can be synthesized by
conventional techniques including automated DNA synthesizers. Alternatively, PCR
amplification of gene fragments can be carried out using anchor primers that give rise to
complementary overhangs between two consecutive gene fragments that can
subsequently be annealed and reamplified to generate a chimeric gene sequence (see, for
example, Ausubel et al. (eds.) Current Protocols in Molecular Biology, John
Wiley & Sons, 1992). Moreover, many expression vectors are commercially available
that already encode a fusion moiety (e.g., a GST polypeptide). A nucleic acid encoding a
polypeptide of the invention can be cloned into such an expression vector such that the
fusion moiety is linked in-frame to the protein of the invention.
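
The in-frame requirement for a fusion gene can be illustrated with a short Python sketch. The sequences below are toy examples and are not sequences of the invention or of any real vector; the helper names `translate` and `fuse_in_frame` are hypothetical. The sketch assumes the standard genetic code and a fusion moiety placed upstream of the insert, with the stop codon of the fusion moiety removed so that translation reads through into the insert.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
STOP_CODONS = ("TAA", "TAG", "TGA")


def translate(dna):
    """Translate from the first base until a stop codon or the end."""
    dna = dna.upper()
    protein = []
    for pos in range(0, len(dna) - len(dna) % 3, 3):
        residue = CODON_TABLE[dna[pos:pos + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


def fuse_in_frame(tag_cds, insert_cds, linker=""):
    """Join tag + linker + insert, refusing any part that breaks the frame."""
    tag = tag_cds.upper()
    # Drop the tag's own stop codon so the ribosome continues into the insert.
    if len(tag) % 3 == 0 and tag[-3:] in STOP_CODONS:
        tag = tag[:-3]
    parts = (("tag", tag), ("linker", linker.upper()), ("insert", insert_cds.upper()))
    for name, seq in parts:
        if len(seq) % 3 != 0:
            raise ValueError(
                f"{name} length {len(seq)} is not a multiple of 3; "
                "the fusion would be out of frame"
            )
    return "".join(seq for _, seq in parts)


if __name__ == "__main__":
    toy_tag = "ATGTCCCCTTAA"      # hypothetical tag: Met-Ser-Pro-stop
    toy_insert = "ATGAAAGGCTGA"   # hypothetical insert: Met-Lys-Gly-stop
    fused = fuse_in_frame(toy_tag, toy_insert)
    print(translate(fused))       # MSPMKG
```

A single-base deletion in the insert (for example, dropping its first base) makes `fuse_in_frame` raise an error, mirroring the frameshift that an out-of-frame ligation would introduce.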

4.8 GENE THERAPY 

Mutations in the polynucleotides of the invention may result in loss of
normal function of the encoded protein. The invention thus provides gene therapy to
restore normal activity of the polypeptides of the invention, or to treat disease states
involving polypeptides of the invention. Delivery of a functional gene encoding
polypeptides of the invention to appropriate cells is effected ex vivo, in situ, or in vivo by
use of vectors, and more particularly viral vectors (e.g., adenovirus, adeno-associated
virus, or a retrovirus), or ex vivo by use of physical DNA transfer methods (e.g.,
liposomes or chemical treatments). See, for example, Anderson, Nature, supplement to
vol. 392, no. 6679, pp. 25-30 (1998). For additional reviews of gene therapy technology
see Friedmann, Science, 244: 1275-1281 (1989); Verma, Scientific American: 68-84
(1990); and Miller, Nature, 357: 455-460 (1992). Introduction of any one of the
nucleotides of the present invention or a gene encoding the polypeptides of the present
invention can also be accomplished with extrachromosomal substrates (transient
expression) or artificial chromosomes (stable expression). Cells may also be cultured ex
vivo in the presence of proteins of the present invention in order to proliferate or to
produce a desired effect on or activity in such cells. Treated cells can then be introduced
in vivo for therapeutic purposes. Alternatively, it is contemplated that in other human
disease states, preventing the expression of or inhibiting the activity of polypeptides of
the invention will be useful in treating the disease states. It is contemplated that antisense
therapy or gene therapy could be applied to negatively regulate the expression of
polypeptides of the invention.

Other methods of inhibiting expression of a protein include the introduction of
antisense molecules to the nucleic acids of the present invention, their complements, or their
translated RNA sequences, by methods known in the art. Further, the polypeptides of the
present invention can be inhibited by using targeted deletion methods, or the insertion of a
negative regulatory element such as a silencer, which is tissue specific.

The present invention still further provides cells genetically engineered in vivo to 
express the polynucleotides of the invention, wherein such polynucleotides are in operative 
association with a regulatory sequence heterologous to the host cell which drives expression
of the polynucleotides in the cell. These methods can be used to increase or decrease the 
expression of the polynucleotides of the present invention. 

Knowledge of DNA sequences provided by the invention allows for modification of
cells to permit, increase, or decrease expression of endogenous polypeptide. Cells can be
modified (e.g., by homologous recombination) to provide increased polypeptide expression
by replacing, in whole or in part, the naturally occurring promoter with all or part of a
heterologous promoter so that the cells express the protein at higher levels. The heterologous
promoter is inserted in such a manner that it is operatively linked to the desired protein
encoding sequences. See, for example, PCT International Publication No. WO 94/12650,
PCT International Publication No. WO 92/20808, and PCT International Publication No.
WO 91/09955. It is also contemplated that, in addition to heterologous promoter DNA,
amplifiable marker DNA (e.g., ada, dhfr, and the multifunctional CAD gene which encodes
carbamyl phosphate synthase, aspartate transcarbamylase, and dihydroorotase) and/or intron
DNA may be inserted along with the heterologous promoter DNA. If linked to the desired
protein coding sequence, amplification of the marker DNA by standard selection methods
results in co-amplification of the desired protein coding sequences in the cells.

In another embodiment of the present invention, cells and tissues may be engineered
to express an endogenous gene comprising the polynucleotides of the invention under the
control of inducible regulatory elements, in which case the regulatory sequences of the
endogenous gene may be replaced by homologous recombination. As described herein,
gene targeting can be used to replace a gene's existing regulatory region with a regulatory
sequence isolated from a different gene or a novel regulatory sequence synthesized by
genetic engineering methods. Such regulatory sequences may be comprised of promoters,
enhancers, scaffold-attachment regions, negative regulatory elements, transcriptional
initiation sites, regulatory protein binding sites or combinations of said sequences.
Alternatively, sequences which affect the structure or stability of the RNA or protein
produced may be replaced, removed, added, or otherwise modified by targeting. These
sequences include polyadenylation signals, mRNA stability elements, splice sites, leader
sequences for enhancing or modifying transport or secretion properties of the protein, or
other sequences which alter or improve the function or stability of protein or RNA
molecules.

The targeting event may be a simple insertion of the regulatory sequence, placing the
gene under the control of the new regulatory sequence, e.g., inserting a new promoter or
enhancer or both upstream of a gene. Alternatively, the targeting event may be a simple
deletion of a regulatory element, such as the deletion of a tissue-specific negative regulatory
element. Alternatively, the targeting event may replace an existing element; for example, a
tissue-specific enhancer can be replaced by an enhancer that has broader or different
cell-type specificity than the naturally occurring elements. Here, the naturally occurring
sequences are deleted and new sequences are added. In all cases, the identification of the
targeting event may be facilitated by the use of one or more selectable marker genes that are
contiguous with the targeting DNA, allowing for the selection of cells in which the
exogenous DNA has integrated into the cell genome. The identification of the targeting
event may also be facilitated by the use of one or more marker genes exhibiting the property
of negative selection, such that the negatively selectable marker is linked to the exogenous
DNA, but configured such that the negatively selectable marker flanks the targeting
sequence, and such that a correct homologous recombination event with sequences in the
host cell genome does not result in the stable integration of the negatively selectable marker.
Markers useful for this purpose include the Herpes Simplex Virus thymidine kinase (TK)
gene or the bacterial xanthine-guanine phosphoribosyl-transferase (gpt) gene.
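
The combined use of a selectable marker and a negatively selectable marker described above amounts to a simple survival rule, sketched here in Python. This is an idealized model under stated assumptions: every cell either carries or lacks each marker, random integration is assumed to retain the flanking negative marker, and the class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Cell:
    has_positive_marker: bool  # selectable marker contiguous with the targeting DNA
    has_negative_marker: bool  # e.g., HSV-TK or gpt flanking the targeting sequence


def survives_selection(cell):
    """A cell survives both selections only if it carries the positive
    marker and has not stably integrated the negative marker."""
    return cell.has_positive_marker and not cell.has_negative_marker


# Idealized outcomes after introducing the targeting construct.
outcomes = {
    "no integration": Cell(False, False),
    "random integration": Cell(True, True),
    "homologous recombination": Cell(True, False),
}

survivors = [name for name, cell in outcomes.items() if survives_selection(cell)]

if __name__ == "__main__":
    print(survivors)
```

In practice the enrichment is not absolute, since random integrants can lose or silence the negative marker, so surviving clones are ordinarily confirmed by other means.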

The gene targeting or gene activation techniques which can be used in accordance
with this aspect of the invention are more particularly described in U.S. Patent No.
5,272,071 to Chappel; U.S. Patent No. 5,578,461 to Sherwin et al.; International Application
No. PCT/US92/09627 (WO93/09222) by Selden et al.; and International Application No.
PCT/US90/06436 (WO91/06667) by Skoultchi et al., each of which is incorporated by
reference herein in its entirety.

4.9 TRANSGENIC ANIMALS

In preferred methods to determine biological functions of the polypeptides of the
invention in vivo, one or more genes provided by the invention are either overexpressed
or inactivated in the germ line of animals using homologous recombination [Capecchi,
Science 244:1288-1292 (1989)]. Animals in which the gene is overexpressed, under the
regulatory control of exogenous or endogenous promoter elements, are known as
transgenic animals. Animals in which an endogenous gene has been inactivated by
homologous recombination are referred to as "knockout" animals. Knockout animals,
preferably non-human mammals, can be prepared as described in U.S. Patent No.
5,557,032, incorporated herein by reference. Transgenic animals are useful to determine
the roles polypeptides of the invention play in biological processes, and preferably in
disease states. Transgenic animals are useful as model systems to identify compounds
that modulate lipid metabolism. Transgenic animals, preferably non-human mammals,
are produced using methods as described in U.S. Patent No. 5,489,743 and PCT
Publication No. WO94/28122, incorporated herein by reference.

Transgenic animals can be prepared wherein all or part of a promoter of the
polynucleotides of the invention is either activated or inactivated to alter the level of
expression of the polypeptides of the invention. Inactivation can be carried out using
homologous recombination methods described above. Activation can be achieved by
supplementing or even replacing the homologous promoter to provide for increased
protein expression. The homologous promoter can be supplemented by insertion of one
or more heterologous enhancer elements known to confer promoter activation in a
particular tissue.

The polynucleotides of the present invention also make possible the development, 
through, e.g., homologous recombination or knock out strategies, of animals that fail to 
express polypeptides of the invention or that express a variant polypeptide. Such animals
are useful as models for studying the in vivo activities of polypeptide as well as for 
studying modulators of the polypeptides of the invention. 

4.10 USES AND BIOLOGICAL ACTIVITY 

The polynucleotides and proteins of the present invention are expected to exhibit
one or more of the uses or biological activities (including those associated with assays
cited herein) identified herein. Uses or activities described for proteins of the present
invention may be provided by administration or use of such proteins or of
polynucleotides encoding such proteins (such as, for example, in gene therapies or
vectors suitable for introduction of DNA). The mechanism underlying the particular
condition or pathology will dictate whether the polypeptides of the invention, the
polynucleotides of the invention or modulators (activators or inhibitors) thereof would be
beneficial to the subject in need of treatment. Thus, "therapeutic compositions of the
invention" include compositions comprising isolated polynucleotides (including
recombinant DNA molecules, cloned genes and degenerate variants thereof) or
polypeptides of the invention (including full length protein, mature protein and
truncations or domains thereof), or compounds and other substances that modulate the
overall activity of the target gene products, either at the level of target gene/protein
expression or target protein activity. Such modulators include polypeptides, analogs
(variants), including fragments and fusion proteins, antibodies and other binding proteins;
chemical compounds that directly or indirectly activate or inhibit the polypeptides of the
invention (identified, e.g., via drug screening assays as described herein); antisense
polynucleotides and polynucleotides suitable for triple helix formation; and in particular
antibodies or other binding partners that specifically recognize one or more epitopes of
the polypeptides of the invention.

The polypeptides of the present invention may likewise be involved in cellular
activation or in one of the other physiological pathways described herein.

4.10.1 RESEARCH USES AND UTILITIES 

The polynucleotides provided by the present invention can be used by the
research community for various purposes. The polynucleotides can be used to express
recombinant protein for analysis, characterization or therapeutic use; as markers for
tissues in which the corresponding protein is preferentially expressed (either
constitutively or at a particular stage of tissue differentiation or development or in disease
states); as molecular weight markers on gels; as chromosome markers or tags (when
labeled) to identify chromosomes or to map related gene positions; to compare with
endogenous DNA sequences in patients to identify potential genetic disorders; as probes
to hybridize and thus discover novel, related DNA sequences; as a source of information
to derive PCR primers for genetic fingerprinting; as a probe to "subtract-out" known
sequences in the process of discovering other novel polynucleotides; for selecting and
making oligomers for attachment to a "gene chip" or other support, including for
examination of expression patterns; to raise anti-protein antibodies using DNA
immunization techniques; and as an antigen to raise anti-DNA antibodies or elicit another
immune response. Where the polynucleotide encodes a protein which binds or
potentially binds to another protein (such as, for example, in a receptor-ligand
interaction), the polynucleotide can also be used in interaction trap assays (such as, for
example, that described in Gyuris et al., Cell 75:791-803 (1993)) to identify
polynucleotides encoding the other protein with which binding occurs or to identify
inhibitors of the binding interaction.

The polypeptides provided by the present invention can similarly be used in
assays to determine biological activity, including in a panel of multiple proteins for
high-throughput screening; to raise antibodies or to elicit another immune response; as a
reagent (including the labeled reagent) in assays designed to quantitatively determine
levels of the protein (or its receptor) in biological fluids; as markers for tissues in which
the corresponding polypeptide is preferentially expressed (either constitutively or at a
particular stage of tissue differentiation or development or in a disease state); and, of
course, to isolate correlative receptors or ligands. Proteins involved in these binding
interactions can also be used to screen for peptide or small molecule inhibitors or agonists
of the binding interaction.

Any or all of these research utilities are capable of being developed into reagent 
grade or kit format for commercialization as research products. 

Methods for performing the uses listed above are well known to those skilled in
the art. References disclosing such methods include without limitation "Molecular
Cloning: A Laboratory Manual", 2d ed., Cold Spring Harbor Laboratory Press,
Sambrook, J., E. F. Fritsch and T. Maniatis eds., 1989, and "Methods in Enzymology:
Guide to Molecular Cloning Techniques", Academic Press, Berger, S. L. and A. R.
Kimmel eds., 1987.

4.10.2 NUTRITIONAL USES 

Polynucleotides and polypeptides of the present invention can also be used as
nutritional sources or supplements. Such uses include without limitation use as a protein or
amino acid supplement, use as a carbon source, use as a nitrogen source and use as a source
of carbohydrate. In such cases the polypeptide or polynucleotide of the invention can be
added to the feed of a particular organism or can be administered as a separate solid or liquid
preparation, such as in the form of powder, pills, solutions, suspensions or capsules. In the
case of microorganisms, the polypeptide or polynucleotide of the invention can be added to
the medium in or on which the microorganism is cultured.

4.10.3 CYTOKINE AND CELL PROLIFERATION/DIFFERENTIATION
ACTIVITY 

A polypeptide of the present invention may exhibit activity relating to cytokine,
cell proliferation (either inducing or inhibiting) or cell differentiation (either inducing or
inhibiting) activity or may induce production of other cytokines in certain cell
populations. A polynucleotide of the invention can encode a polypeptide exhibiting such
attributes. Many protein factors discovered to date, including all known cytokines, have
exhibited activity in one or more factor-dependent cell proliferation assays, and hence the
assays serve as a convenient confirmation of cytokine activity. The activity of therapeutic
compositions of the present invention is evidenced by any one of a number of routine
factor-dependent cell proliferation assays for cell lines including, without limitation, 32D,
DA2, DA1G, T10, B9, B9/11, BaF3, MC9/G, M+(preB M+), 2E8, RB5, DA1, 123,
T1165, HT2, CTLL2, TF-1, Mo7e, CMK, HUVEC, and Caco. Therapeutic compositions
of the invention can be used in the following:

Assays for T-cell or thymocyte proliferation include without limitation those
described in: Current Protocols in Immunology, Ed. by J. E. Coligan, A. M. Kruisbeek, D.
H. Margulies, E. M. Shevach, W. Strober, Pub. Greene Publishing Associates and
Wiley-Interscience (Chapter 3, In Vitro assays for Mouse Lymphocyte Function 3.1-3.19;
Chapter 7, Immunologic studies in Humans); Takai et al., J. Immunol. 137:3494-3500,
1986; Bertagnolli et al., J. Immunol. 145:1706-1712, 1990; Bertagnolli et al., Cellular
Immunology 133:327-341, 1991; Bertagnolli et al., J. Immunol. 149:3778-3783, 1992;
Bowman et al., J. Immunol. 152:1756-1761, 1994.

Assays for cytokine production and/or proliferation of spleen cells, lymph node
cells or thymocytes include, without limitation, those described in: Polyclonal T cell
stimulation, Kruisbeek, A. M. and Shevach, E. M. In Current Protocols in Immunology.
J. E. e.a. Coligan eds. Vol 1 pp. 3.12.1-3.12.14, John Wiley and Sons, Toronto. 1994; and
Measurement of mouse and human interleukin-γ, Schreiber, R. D. In Current Protocols in
Immunology. J. E. e.a. Coligan eds. Vol 1 pp. 6.8.1-6.8.8, John Wiley and Sons, Toronto.
1994.

Assays for proliferation and differentiation of hematopoietic and lymphopoietic
cells include, without limitation, those described in: Measurement of Human and Murine
Interleukin 2 and Interleukin 4, Bottomly, K., Davis, L. S. and Lipsky, P. E. In Current
Protocols in Immunology. J. E. e.a. Coligan eds. Vol 1 pp. 6.3.1-6.3.12, John Wiley and
Sons, Toronto. 1991; deVries et al., J. Exp. Med. 173:1205-1211, 1991; Moreau et al.,
Nature 336:690-692, 1988; Greenberger et al., Proc. Natl. Acad. Sci. U.S.A.
80:2931-2938, 1983; Measurement of mouse and human interleukin 6, Nordan, R. In
Current Protocols in Immunology. J. E. Coligan eds. Vol 1 pp. 6.6.1-6.6.5, John Wiley
and Sons, Toronto. 1991; Smith et al., Proc. Natl. Acad. Sci. U.S.A. 83:1857-1861, 1986;
Measurement of human Interleukin 11, Bennett, F., Giannotti, J., Clark, S. C. and Turner,
K. J. In Current Protocols in Immunology. J. E. Coligan eds. Vol 1 pp. 6.15.1, John
Wiley and Sons, Toronto. 1991; Measurement of mouse and human Interleukin 9,
Ciarletta, A., Giannotti, J., Clark, S. C. and Turner, K. J. In Current Protocols in
Immunology. J. E. Coligan eds. Vol 1 pp. 6.13.1, John Wiley and Sons, Toronto. 1991.

Assays for T-cell clone responses to antigens (which will identify, among others,
proteins that affect APC-T cell interactions as well as direct T-cell effects by measuring
proliferation and cytokine production) include, without limitation, those described in:
Current Protocols in Immunology, Ed. by J. E. Coligan, A. M. Kruisbeek, D. H.
Margulies, E. M. Shevach, W. Strober, Pub. Greene Publishing Associates and
Wiley-Interscience (Chapter 3, In Vitro assays for Mouse Lymphocyte Function; Chapter
6, Cytokines and their cellular receptors; Chapter 7, Immunologic studies in Humans);
Weinberger et al., Proc. Natl. Acad. Sci. USA 77:6091-6095, 1980; Weinberger et al.,
Eur. J. Immun. 11:405-411, 1981; Takai et al., J. Immunol. 137:3494-3500, 1986; Takai
et al., J. Immunol. 140:508-512, 1988.

4.10.4 STEM CELL GROWTH FACTOR ACTIVITY

A polypeptide of the present invention may exhibit stem cell growth factor
activity and be involved in the proliferation, differentiation and survival of pluripotent
and totipotent stem cells including primordial germ cells, embryonic stem cells,
hematopoietic stem cells and/or germ line stem cells. Administration of the polypeptide
of the invention to stem cells in vivo or ex vivo is expected to maintain and expand cell
populations in a totipotential or pluripotential state which would be useful for
re-engineering damaged or diseased tissues, transplantation, manufacture of
bio-pharmaceuticals and the development of bio-sensors. The ability to produce large
quantities of human cells has important working applications for the production of human
proteins which currently must be obtained from non-human sources or donors;
implantation of cells to treat diseases such as Parkinson's, Alzheimer's and other
neurodegenerative diseases; tissues for grafting such as bone marrow, skin, cartilage,
tendons, bone, muscle (including cardiac muscle), blood vessels, cornea, neural cells,
gastrointestinal cells and others; and organs for transplantation such as kidney, liver,
pancreas (including islet cells), heart and lung.

It is contemplated that multiple different exogenous growth factors and/or
cytokines may be administered in combination with the polypeptide of the invention to
achieve the desired effect, including any of the growth factors listed herein, other stem
cell maintenance factors, and specifically including stem cell factor (SCF), leukemia
inhibitory factor (LIF), Flt-3 ligand (Flt-3L), any of the interleukins, recombinant soluble
IL-6 receptor fused to IL-6, macrophage inflammatory protein 1-alpha (MIP-1-alpha),
G-CSF, GM-CSF, thrombopoietin (TPO), platelet factor 4 (PF-4), platelet-derived growth
factor (PDGF), neural growth factors and basic fibroblast growth factor (bFGF).

Since totipotent stem cells can give rise to virtually any mature cell type,
expansion of these cells in culture will facilitate the production of large quantities of
mature cells. Techniques for culturing stem cells are known in the art and administration
of polypeptides of the invention, optionally with other growth factors and/or cytokines, is
expected to enhance the survival and proliferation of the stem cell populations. This can
be accomplished by direct administration of the polypeptide of the invention to the
culture medium. Alternatively, stroma cells transfected with a polynucleotide that
encodes the polypeptide of the invention can be used as a feeder layer for the stem
cell populations in culture or in vivo. Stromal support cells for feeder layers may include
embryonic bone marrow fibroblasts, bone marrow stromal cells, fetal liver cells, or
cultured embryonic fibroblasts (see U.S. Patent No. 5,690,926).

Stem cells themselves can be transfected with a polynucleotide of the invention to
induce autocrine expression of the polypeptide of the invention. This will allow for
generation of undifferentiated totipotential/pluripotential stem cell lines that are useful as
is or that can then be differentiated into the desired mature cell types. These stable cell
lines can also serve as a source of undifferentiated totipotential/pluripotential mRNA to
create cDNA libraries and templates for polymerase chain reaction experiments. These
studies would allow for the isolation and identification of differentially expressed genes
in stem cell populations that regulate stem cell proliferation and/or maintenance.
Expansion and maintenance of totipotent stem cell populations will be useful in
the treatment of many pathological conditions. For example, polypeptides of the present
invention may be used to manipulate stem cells in culture to give rise to neuroepithelial
cells that can be used to augment or replace cells damaged by illness, autoimmune
disease, accidental damage or genetic disorders. The polypeptide of the invention may be
useful for inducing the proliferation of neural cells and for the regeneration of nerve and
brain tissue, i.e. for the treatment of central and peripheral nervous system diseases and
neuropathies, as well as mechanical and traumatic disorders which involve degeneration,
death or trauma to neural cells or nerve tissue. In addition, the expanded stem cell
populations can also be genetically altered for gene therapy purposes and to decrease host
rejection of replacement tissues after grafting or implantation.

Expression of the polypeptide of the invention and its effect on stem cells can also
be manipulated to achieve controlled differentiation of the stem cells into more
differentiated cell types. A broadly applicable method of obtaining pure populations of a
specific differentiated cell type from undifferentiated stem cell populations involves the
use of a cell-type specific promoter driving a selectable marker. The selectable marker
allows only cells of the desired type to survive. For example, stem cells can be induced
to differentiate into cardiomyocytes (Wobus et al., Differentiation, 48: 173-182, (1991);
Klug et al., J. Clin. Invest., 98(1): 216-224, (1998)) or skeletal muscle cells (Browder, L.
W. In: Principles of Tissue Engineering eds. Lanza et al., Academic Press (1997)).
Alternatively, directed differentiation of stem cells can be accomplished by culturing the
stem cells in the presence of a differentiation factor such as retinoic acid and an
antagonist of the polypeptide of the invention which would inhibit the effects of
endogenous stem cell factor activity and allow differentiation to proceed.

In vitro cultures of stem cells can be used to determine if the polypeptide of the
invention exhibits stem cell growth factor activity. Stem cells are isolated from any one
of various cell sources (including hematopoietic stem cells and embryonic stem cells) and
cultured on a feeder layer, as described by Thompson et al. Proc. Natl. Acad. Sci. U.S.A.,
92: 7844-7848 (1995), in the presence of the polypeptide of the invention alone or in
combination with other growth factors or cytokines. The ability of the polypeptide of the
invention to induce stem cell proliferation is determined by colony formation on
semi-solid support, e.g. as described by Bernstein et al., Blood, 77: 2316-2321 (1991).

4.10.5 HEMATOPOIESIS REGULATING ACTIVITY

A polypeptide of the present invention may be involved in regulation of 
hematopoiesis and, consequently, in the treatment of myeloid or lymphoid cell disorders. 

Even marginal biological activity in support of colony forming cells or of
factor-dependent cell lines indicates involvement in regulating hematopoiesis, e.g. in
supporting the growth and proliferation of erythroid progenitor cells alone or in
combination with other cytokines, thereby indicating utility, for example, in treating
various anemias or for use in conjunction with irradiation/chemotherapy to stimulate the
production of erythroid precursors and/or erythroid cells; in supporting the growth and
proliferation of myeloid cells such as granulocytes and monocytes/macrophages (i.e.,
traditional CSF activity) useful, for example, in conjunction with chemotherapy to
prevent or treat consequent myelo-suppression; in supporting the growth and proliferation
of megakaryocytes and consequently of platelets thereby allowing prevention or
treatment of various platelet disorders such as thrombocytopenia, and generally for use in
place of or complementary to platelet transfusions; and/or in supporting the growth and
proliferation of hematopoietic stem cells which are capable of maturing to any and all of
the above-mentioned hematopoietic cells and therefore find therapeutic utility in various
stem cell disorders (such as those usually treated with transplantation, including, without
limitation, aplastic anemia and paroxysmal nocturnal hemoglobinuria), as well as in
repopulating the stem cell compartment post irradiation/chemotherapy, either in vivo or
ex vivo (i.e., in conjunction with bone marrow transplantation or with peripheral
progenitor cell transplantation (homologous or heterologous)) as normal cells or
genetically manipulated for gene therapy.

Therapeutic compositions of the invention can be used in the following:
Suitable assays for proliferation and differentiation of various hematopoietic lines
are cited above.

Assays for embryonic stem cell differentiation (which will identify, among others,
proteins that influence embryonic differentiation hematopoiesis) include, without
limitation, those described in: Johansson et al. Cellular Biology 15:141-151, 1995; Keller
et al., Molecular and Cellular Biology 13:473-486, 1993; McClanahan et al., Blood
81:2903-2915, 1993.

Assays for stem cell survival and differentiation (which will identify, among
others, proteins that regulate lympho-hematopoiesis) include, without limitation, those
described in: Methylcellulose colony forming assays, Freshney, M. G. In Culture of
Hematopoietic Cells. R. I. Freshney, et al. eds. Vol pp. 265-268, Wiley-Liss, Inc., New
York, N.Y. 1994; Hirayama et al., Proc. Natl. Acad. Sci. USA 89:5907-5911, 1992;
Primitive hematopoietic colony forming cells with high proliferative potential, McNiece,
I. K. and Briddell, R. A. In Culture of Hematopoietic Cells. R. I. Freshney, et al. eds. Vol
pp. 23-39, Wiley-Liss, Inc., New York, N.Y. 1994; Neben et al., Experimental
Hematology 22:353-359, 1994; Cobblestone area forming cell assay, Ploemacher, R. E.
In Culture of Hematopoietic Cells. R. I. Freshney, et al. eds. Vol pp. 1-21, Wiley-Liss,
Inc., New York, N.Y. 1994; Long term bone marrow cultures in the presence of stromal
cells, Spooncer, E., Dexter, M. and Allen, T. In Culture of Hematopoietic Cells. R. I.
Freshney, et al. eds. Vol pp. 163-179, Wiley-Liss, Inc., New York, N.Y. 1994; Long term
culture initiating cell assay, Sutherland, H. J. In Culture of Hematopoietic Cells. R. I.
Freshney, et al. eds. Vol pp. 139-162, Wiley-Liss, Inc., New York, N.Y. 1994.

4.10.6 TISSUE GROWTH ACTIVITY 

A polypeptide of the present invention also may be involved in bone, cartilage,
tendon, ligament and/or nerve tissue growth or regeneration, as well as in wound healing
and tissue repair and replacement, and in healing of burns, incisions and ulcers.

A polypeptide of the present invention which induces cartilage and/or bone
growth in circumstances where bone is not normally formed has application in the
healing of bone fractures and cartilage damage or defects in humans and other animals.
Compositions of a polypeptide, antibody, binding partner, or other modulator of the
invention may have prophylactic use in closed as well as open fracture reduction and also
in the improved fixation of artificial joints. De novo bone formation induced by an
osteogenic agent contributes to the repair of congenital, trauma induced, or oncologic
resection induced craniofacial defects, and also is useful in cosmetic plastic surgery.
A polypeptide of this invention may also be involved in attracting bone-forming
cells, stimulating growth of bone-forming cells, or inducing differentiation of progenitors
of bone-forming cells. Treatment of osteoporosis, osteoarthritis, bone degenerative
disorders, or periodontal disease, such as through stimulation of bone and/or cartilage
repair or by blocking inflammation or processes of tissue destruction (collagenase
activity, osteoclast activity, etc.) mediated by inflammatory processes may also be
possible using the composition of the invention.

Another category of tissue regeneration activity that may involve the polypeptide
of the present invention is tendon/ligament formation. Induction of tendon/ligament-like
tissue or other tissue formation in circumstances where such tissue is not normally
formed has application in the healing of tendon or ligament tears, deformities and other
tendon or ligament defects in humans and other animals. Such a preparation employing a
tendon/ligament-like tissue inducing protein may have prophylactic use in preventing
damage to tendon or ligament tissue, as well as use in the improved fixation of tendon or
ligament to bone or other tissues, and in repairing defects to tendon or ligament tissue.
De novo tendon/ligament-like tissue formation induced by a composition of the present
invention contributes to the repair of congenital, trauma induced, or other tendon or
ligament defects of other origin, and is also useful in cosmetic plastic surgery for
attachment or repair of tendons or ligaments. The compositions of the present invention
may provide an environment to attract tendon- or ligament-forming cells, stimulate growth
of tendon- or ligament-forming cells, induce differentiation of progenitors of tendon- or
ligament-forming cells, or induce growth of tendon/ligament cells or progenitors ex vivo
for return in vivo to effect tissue repair. The compositions of the invention may also be
useful in the treatment of tendinitis, carpal tunnel syndrome and other tendon or ligament
defects. The compositions may also include an appropriate matrix and/or sequestering
agent as a carrier as is well known in the art.

The compositions of the present invention may also be useful for proliferation of
neural cells and for regeneration of nerve and brain tissue, i.e. for the treatment of central
and peripheral nervous system diseases and neuropathies, as well as mechanical and
traumatic disorders, which involve degeneration, death or trauma to neural cells or nerve
tissue. More specifically, a composition may be used in the treatment of diseases of the
peripheral nervous system, such as peripheral nerve injuries, peripheral neuropathy and
localized neuropathies, and central nervous system diseases, such as Alzheimer's disease,
Parkinson's disease, Huntington's disease, amyotrophic lateral sclerosis, and Shy-Drager
syndrome. Further conditions which may be treated in accordance with the present
invention include mechanical and traumatic disorders, such as spinal cord disorders, head
trauma and cerebrovascular diseases such as stroke. Peripheral neuropathies resulting
from chemotherapy or other medical therapies may also be treatable using a composition
of the invention.

Compositions of the invention may also be useful to promote better or faster
closure of non-healing wounds, including without limitation pressure ulcers, ulcers
associated with vascular insufficiency, surgical and traumatic wounds, and the like.

Compositions of the present invention may also be involved in the generation or
regeneration of other tissues, such as organs (including, for example, pancreas, liver,
intestine, kidney, skin, endothelium), muscle (smooth, skeletal or cardiac) and vascular
(including vascular endothelium) tissue, or for promoting the growth of cells comprising
such tissues. Part of the desired effects may be by inhibition or modulation of fibrotic
scarring to allow normal tissue to regenerate. A polypeptide of the present invention
may also exhibit angiogenic activity.

A composition of the present invention may also be useful for gut protection or
regeneration and treatment of lung or liver fibrosis, reperfusion injury in various tissues,
and conditions resulting from systemic cytokine damage.

A composition of the present invention may also be useful for promoting or 
inhibiting differentiation of tissues described above from precursor tissues or cells; or for 
inhibiting the growth of tissues described above. 

Therapeutic compositions of the invention can be used in the following:
Assays for tissue generation activity include, without limitation, those described
in: International Patent Publication No. WO95/16035 (bone, cartilage, tendon);
International Patent Publication No. WO95/05846 (nerve, neuronal); International Patent
Publication No. WO91/07491 (skin, endothelium).

Assays for wound healing activity include, without limitation, those described in:
Winter, Epidermal Wound Healing, pp. 71-112 (Maibach, H. I. and Rovee, D. T., eds.),
Year Book Medical Publishers, Inc., Chicago, as modified by Eaglstein and Mertz, J.
Invest. Dermatol. 71:382-84 (1978).



4.10.7 IMMUNE STIMULATING OR SUPPRESSING ACTIVITY

A polypeptide of the present invention may also exhibit immune stimulating or
immune suppressing activity, including without limitation the activities for which assays
are described herein. A polynucleotide of the invention can encode a polypeptide
exhibiting such activities. A protein may be useful in the treatment of various immune
deficiencies and disorders (including severe combined immunodeficiency (SCID)), e.g.,
in regulating (up or down) growth and proliferation of T and/or B lymphocytes, as well as
affecting the cytolytic activity of NK cells and other cell populations. These immune
deficiencies may be genetic or be caused by viral (e.g., HIV) as well as bacterial or
fungal infections, or may result from autoimmune disorders. More specifically, infectious
diseases caused by viral, bacterial, fungal or other infection may be treatable using a
protein of the present invention, including infections by HIV, hepatitis viruses, herpes
viruses, mycobacteria, Leishmania spp., malaria spp. and various fungal infections such
as candidiasis. Of course, in this regard, proteins of the present invention may also be
useful where a boost to the immune system generally may be desirable, i.e., in the
treatment of cancer.

Autoimmune disorders which may be treated using a protein of the present
invention include, for example, connective tissue disease, multiple sclerosis, systemic
lupus erythematosus, rheumatoid arthritis, autoimmune pulmonary inflammation,
Guillain-Barre syndrome, autoimmune thyroiditis, insulin dependent diabetes mellitus,
myasthenia gravis, graft-versus-host disease and autoimmune inflammatory eye disease.
Such a protein (or antagonists thereof, including antibodies) of the present invention may
also be useful in the treatment of allergic reactions and conditions (e.g., anaphylaxis,
serum sickness, drug reactions, food allergies, insect venom allergies, mastocytosis,
allergic rhinitis, hypersensitivity pneumonitis, urticaria, angioedema, eczema, atopic
dermatitis, allergic contact dermatitis, erythema multiforme, Stevens-Johnson syndrome,
allergic conjunctivitis, atopic keratoconjunctivitis, vernal keratoconjunctivitis, giant
papillary conjunctivitis and contact allergies), such as asthma (particularly allergic
asthma) or other respiratory problems. Other conditions, in which immune suppression is
desired (including, for example, organ transplantation), may also be treatable using a
protein (or antagonists thereof) of the present invention. The therapeutic effects of the
polypeptides or antagonists thereof on allergic reactions can be evaluated by in vivo
animal models such as the cumulative contact enhancement test (Lastbom et al.,
Toxicology 125: 59-66, 1998), skin prick test (Hoffmann et al., Allergy 54: 446-54,
1999), guinea pig skin sensitization test (Vohr et al., Arch. Toxicol. 73: 501-9), and
murine local lymph node assay (Kimber et al., J. Toxicol. Environ. Health 53: 563-79).

Using the proteins of the invention it may also be possible to modulate immune
responses in a number of ways. Down regulation may be in the form of inhibiting or
blocking an immune response already in progress or may involve preventing the
induction of an immune response. The functions of activated T cells may be inhibited by
suppressing T cell responses or by inducing specific tolerance in T cells, or both.
Immunosuppression of T cell responses is generally an active, non-antigen-specific
process which requires continuous exposure of the T cells to the suppressive agent.
Tolerance, which involves inducing non-responsiveness or anergy in T cells, is
distinguishable from immunosuppression in that it is generally antigen-specific and
persists after exposure to the tolerizing agent has ceased. Operationally, tolerance can be
demonstrated by the lack of a T cell response upon reexposure to specific antigen in the
absence of the tolerizing agent.

Down regulating or preventing one or more antigen functions (including without
limitation B lymphocyte antigen functions (such as, for example, B7)), e.g., preventing
high level lymphokine synthesis by activated T cells, will be useful in situations of tissue,
skin and organ transplantation and in graft-versus-host disease (GVHD). For example,
blockage of T cell function should result in reduced tissue destruction in tissue
transplantation. Typically, in tissue transplants, rejection of the transplant is initiated
through its recognition as foreign by T cells, followed by an immune reaction that
destroys the transplant. The administration of a therapeutic composition of the invention
may prevent cytokine synthesis by immune cells, such as T cells, and thus acts as an
immunosuppressant. Moreover, a lack of costimulation may also be sufficient to anergize
the T cells, thereby inducing tolerance in a subject. Induction of long-term tolerance by B
lymphocyte antigen-blocking reagents may avoid the necessity of repeated administration
of these blocking reagents. To achieve sufficient immunosuppression or tolerance in a
subject, it may also be necessary to block the function of a combination of B lymphocyte
antigens.

The efficacy of particular therapeutic compositions in preventing organ transplant
rejection or GVHD can be assessed using animal models that are predictive of efficacy in
humans. Examples of appropriate systems which can be used include allogeneic cardiac
grafts in rats and xenogeneic pancreatic islet cell grafts in mice, both of which have been
used to examine the immunosuppressive effects of CTLA4Ig fusion proteins in vivo as
described in Lenschow et al., Science 257:789-792 (1992) and Turka et al., Proc. Natl.
Acad. Sci. USA, 89:11102-11105 (1992). In addition, murine models of GVHD (see Paul
ed., Fundamental Immunology, Raven Press, New York, 1989, pp. 846-847) can be used
to determine the effect of therapeutic compositions of the invention on the development
of that disease.

Blocking antigen function may also be therapeutically useful for treating
autoimmune diseases. Many autoimmune disorders are the result of inappropriate
activation of T cells that are reactive against self tissue and which promote the production
of cytokines and autoantibodies involved in the pathology of the diseases. Preventing the
activation of autoreactive T cells may reduce or eliminate disease symptoms.
Administration of reagents which block stimulation of T cells can be used to inhibit T
cell activation and prevent production of autoantibodies or T cell-derived cytokines
which may be involved in the disease process. Additionally, blocking reagents may
induce antigen-specific tolerance of autoreactive T cells which could lead to long-term
relief from the disease. The efficacy of blocking reagents in preventing or alleviating
autoimmune disorders can be determined using a number of well-characterized animal
models of human autoimmune diseases. Examples include murine experimental
autoimmune encephalitis, systemic lupus erythematosus in MRL/lpr/lpr mice or NZB
hybrid mice, murine autoimmune collagen arthritis, diabetes mellitus in NOD mice and
BB rats, and murine experimental myasthenia gravis (see Paul ed., Fundamental
Immunology, Raven Press, New York, 1989, pp. 840-856).

Upregulation of an antigen function (e.g., a B lymphocyte antigen function), as a
means of up regulating immune responses, may also be useful in therapy. Upregulation of
immune responses may be in the form of enhancing an existing immune response or
eliciting an initial immune response. For example, enhancing an immune response may
be useful in cases of viral infection, including systemic viral diseases such as influenza,
the common cold, and encephalitis.

Alternatively, anti-viral immune responses may be enhanced in an infected patient
by removing T cells from the patient, costimulating the T cells in vitro with viral
antigen-pulsed APCs either expressing a peptide of the present invention or together with
a stimulatory form of a soluble peptide of the present invention and reintroducing the in
vitro activated T cells into the patient. Another method of enhancing anti-viral immune
responses would be to isolate infected cells from a patient, transfect them with a nucleic
acid encoding a protein of the present invention as described herein such that the cells
express all or a portion of the protein on their surface, and reintroduce the transfected
cells into the patient. The infected cells would now be capable of delivering a
costimulatory signal to, and thereby activate, T cells in vivo.

A polypeptide of the present invention may provide the necessary stimulation
signal to T cells to induce a T cell mediated immune response against the transfected
tumor cells. In addition, tumor cells which lack MHC class I or MHC class II molecules,
or which fail to reexpress sufficient amounts of MHC class I or MHC class II molecules,
can be transfected with nucleic acid encoding all or a portion (e.g., a
cytoplasmic-domain truncated portion) of an MHC class I alpha chain protein and β2
microglobulin protein or an MHC class II alpha chain protein and an MHC class II beta
chain protein to thereby express MHC class I or MHC class II proteins on the cell
surface. Expression of the appropriate class I or class II MHC in conjunction with a
peptide having the activity of a B lymphocyte antigen (e.g., B7-1, B7-2, B7-3) induces a

WO 02/081731 PCT/US02/01222



T cell mediated immune response against the transfected tumor cell. Optionally, a gene
encoding an antisense construct which blocks expression of an MHC class II associated
protein, such as the invariant chain, can also be cotransfected with a DNA encoding a
peptide having the activity of a B lymphocyte antigen to promote presentation of tumor
associated antigens and induce tumor specific immunity. Thus, the induction of a T cell
mediated immune response in a human subject may be sufficient to overcome
tumor-specific tolerance in the subject.

The activity of a protein of the invention may, among other means, be measured 
by the following methods: 

Suitable assays for thymocyte or splenocyte cytotoxicity include, without
limitation, those described in: Current Protocols in Immunology, Ed by J. E. Coligan, A.
M. Kruisbeek, D. H. Margulies, E. M. Shevach, W. Strober, Pub. Greene Publishing
Associates and Wiley-Interscience (Chapter 3, In Vitro assays for Mouse Lymphocyte
Function 3.1-3.19; Chapter 7, Immunologic studies in Humans); Herrmann et al., Proc.
Natl. Acad. Sci. USA 78:2488-2492, 1981; Herrmann et al., J. Immunol. 128:1968-1974,
1982; Handa et al., J. Immunol. 135:1564-1572, 1985; Takai et al., J. Immunol.
137:3494-3500, 1986; Takai et al., J. Immunol. 140:508-512, 1988; Bowman et al., J.
Virology 61:1992-1998; Bertagnolli et al., Cellular Immunology 133:327-341, 1991;
Brown et al., J. Immunol. 153:3079-3092, 1994.

Assays for T-cell-dependent immunoglobulin responses and isotype switching
(which will identify, among others, proteins that modulate T-cell dependent antibody
responses and that affect Th1/Th2 profiles) include, without limitation, those described
in: Maliszewski, J. Immunol. 144:3028-3033, 1990; and Assays for B cell function: In
vitro antibody production, Mond, J. J. and Brunswick, M. In Current Protocols in
Immunology. J. E. e.a. Coligan eds. Vol 1 pp. 3.8.1-3.8.16, John Wiley and Sons,
Toronto. 1994.

Mixed lymphocyte reaction (MLR) assays (which will identify, among others,
proteins that generate predominantly Th1 and CTL responses) include, without limitation,
those described in: Current Protocols in Immunology, Ed by J. E. Coligan, A. M.
Kruisbeek, D. H. Margulies, E. M. Shevach, W. Strober, Pub. Greene Publishing
Associates and Wiley-Interscience (Chapter 3, In Vitro assays for Mouse Lymphocyte
Function 3.1-3.19; Chapter 7, Immunologic studies in Humans); Takai et al., J. Immunol.
137:3494-3500, 1986; Takai et al., J. Immunol. 140:508-512, 1988; Bertagnolli et al., J.
Immunol. 149:3778-3783, 1992.

Dendritic cell-dependent assays (which will identify, among others, proteins
expressed by dendritic cells that activate naive T-cells) include, without limitation, those
described in: Guery et al., J. Immunol. 134:536-544, 1995; Inaba et al., Journal of
Experimental Medicine 173:549-559, 1991; Macatonia et al., Journal of Immunology
154:5071-5079, 1995; Porgador et al., Journal of Experimental Medicine 182:255-260,
1995; Nair et al., Journal of Virology 67:4062-4069, 1993; Huang et al., Science
264:961-965, 1994; Macatonia et al., Journal of Experimental Medicine 169:1255-1264,
1989; Bhardwaj et al., Journal of Clinical Investigation 94:797-807, 1994; and Inaba et
al., Journal of Experimental Medicine 172:631-640, 1990.

Assays for lymphocyte survival/apoptosis (which will identify, among others,
proteins that prevent apoptosis after superantigen induction and proteins that regulate
lymphocyte homeostasis) include, without limitation, those described in: Darzynkiewicz
et al., Cytometry 13:795-808, 1992; Gorczyca et al., Leukemia 7:659-670, 1993;
Gorczyca et al., Cancer Research 53:1945-1951, 1993; Itoh et al., Cell 66:233-243, 1991;
Zacharchuk, Journal of Immunology 145:4037-4045, 1990; Zamai et al., Cytometry
14:891-897, 1993; Gorczyca et al., International Journal of Oncology 1:639-648, 1992.

Assays for proteins that influence early steps of T-cell commitment and
development include, without limitation, those described in: Antica et al., Blood
84:111-117, 1994; Fine et al., Cellular Immunology 155:111-122, 1994; Galy et al.,
Blood 85:2770-2778, 1995; Toki et al., Proc. Nat. Acad. Sci. USA 88:7548-7551, 1991.

4.10.8 ACTIVIN/INHIBIN ACTIVITY

A polypeptide of the present invention may also exhibit activin- or inhibin-related
activities. A polynucleotide of the invention may encode a polypeptide exhibiting such
characteristics. Inhibins are characterized by their ability to inhibit the release of follicle
stimulating hormone (FSH), while activins are characterized by their ability to
stimulate the release of FSH. Thus, a polypeptide of the
present invention, alone or in heterodimers with a member of the inhibin family, may be
useful as a contraceptive based on the ability of inhibins to decrease fertility in female
mammals and decrease spermatogenesis in male mammals. Administration of sufficient
amounts of other inhibins can induce infertility in these mammals. Alternatively, the
polypeptide of the invention, as a homodimer or as a heterodimer with other protein
subunits of the inhibin group, may be useful as a fertility inducing therapeutic, based
upon the ability of activin molecules to stimulate FSH release from cells of the anterior
pituitary. See, for example, U.S. Pat. No. 4,798,885. A polypeptide of the invention may
also be useful for advancement of the onset of fertility in sexually immature mammals, so
as to increase the lifetime reproductive performance of domestic animals such as, but not
limited to, cows, sheep and pigs.

The activity of a polypeptide of the invention may, among other means, be 
measured by the following methods. 

Assays for activin/inhibin activity include, without limitation, those described in:
Vale et al., Endocrinology 91:562-572, 1972; Ling et al., Nature 321:779-782, 1986; Vale
et al., Nature 321:776-779, 1986; Mason et al., Nature 318:659-663, 1985; Forage et al.,
Proc. Natl. Acad. Sci. USA 83:3091-3095, 1986.

4.10.9 CHEMOTACTIC/CHEMOKINETIC ACTIVITY

A polypeptide of the present invention may be involved in chemotactic or
chemokinetic activity for mammalian cells, including, for example, monocytes,
fibroblasts, neutrophils, T-cells, mast cells, eosinophils, epithelial and/or endothelial
cells. A polynucleotide of the invention can encode a polypeptide exhibiting such
attributes. Chemotactic and chemokinetic receptor activation can be used to mobilize or
attract a desired cell population to a desired site of action. Chemotactic or chemokinetic
compositions (e.g. proteins, antibodies, binding partners, or modulators of the invention)
provide particular advantages in treatment of wounds and other trauma to tissues, as well
as in treatment of localized infections. For example, attraction of lymphocytes,
monocytes or neutrophils to tumors or sites of infection may result in improved immune
responses against the tumor or infecting agent.

A protein or peptide has chemotactic activity for a particular cell population if it
can stimulate, directly or indirectly, the directed orientation or movement of such cell
population. Preferably, the protein or peptide has the ability to directly stimulate directed
movement of cells. Whether a particular protein has chemotactic activity for a population
of cells can be readily determined by employing such protein or peptide in any known
assay for cell chemotaxis.

Therapeutic compositions of the invention can be used in the following:

Assays for chemotactic activity (which will identify proteins that induce or
prevent chemotaxis) consist of assays that measure the ability of a protein to induce the
migration of cells across a membrane as well as the ability of a protein to induce the
adhesion of one cell population to another cell population. Suitable assays for movement
and adhesion include, without limitation, those described in: Current Protocols in
Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E. M. Shevach, W.
Strober, Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 6.12,
Measurement of alpha and beta Chemokines 6.12.1-6.12.28); Taub et al. J. Clin. Invest.
95:1370-1376, 1995; Lind et al. APMIS 103:140-146, 1995; Muller et al. Eur. J.
Immunol. 25:1744-1748; Gruber et al. J. of Immunol. 152:5860-5867, 1994; Johnston et
al. J. of Immunol. 153:1762-1768, 1994.

4.10.10 HEMOSTATIC AND THROMBOLYTIC ACTIVITY 

A polypeptide of the invention may also be involved in hemostasis or
thrombolysis or thrombosis. A polynucleotide of the invention can encode a polypeptide
exhibiting such attributes. Compositions may be useful in treatment of various
coagulation disorders (including hereditary disorders, such as hemophilias) or to enhance
coagulation and other hemostatic events in treating wounds resulting from trauma,
surgery or other causes. A composition of the invention may also be useful for dissolving
or inhibiting formation of thromboses and for treatment and prevention of conditions
resulting therefrom (such as, for example, infarction of cardiac and central nervous
system vessels (e.g., stroke)).

Therapeutic compositions of the invention can be used in the following: 
Assays for hemostatic and thrombolytic activity include, without limitation, those
described in: Linet et al., J. Clin. Pharmacol. 26:131-140, 1986; Burdick et al.,
Thrombosis Res. 45:413-419, 1987; Humphrey et al., Fibrinolysis 5:71-79 (1991);
Schaub, Prostaglandins 35:467-474, 1988.

4.10.11 CANCER DIAGNOSIS AND THERAPY 

5 Polypeptides of the invention may be involved in cancer cell generation, 

proliferation or metastasis. Detection of the presence or amount of polynucleotides or 
polypeptides of the invention may be useful for the diagnosis and/or prognosis of one or 
more types of cancer. For example, the presence or increased expression of a 
polynucleotide/polypeptide of the invention may indicate a hereditary risk of cancer, a 

10 precancerous condition, or an ongoing malignancy. Conversely, a defect in the gene or 
absence of the polypeptide may be associated with a cancer condition. Identification of 
single nucleotide polymorphisms associated with cancer or a predisposition to cancer 
may also be useful for diagnosis or prognosis. 

Cancer treatments promote tumor regression by inhibiting tumor cell 

15 proliferation, inhibiting angiogenesis (growth of new blood vessels that is necessary to 
support tumor growth) and/or prohibiting metastasis by reducing tumor cell motility or 
invasiveness. Therapeutic compositions of the invention may be effective in adult and 
pediatric oncology including in solid phase tumors/malignancies, locally advanced 
tumors, human soft tissue sarcomas, metastatic cancer, including lymphatic metastases, 

20 blood cell malignancies including multiple myeloma, acute and chronic leukemias, and 
lymphomas, head and neck cancers including mouth cancer, larynx cancer and thyroid 
cancer, lung cancers including small cell carcinoma and non-small cell cancers, breast 
cancers including small cell carcinoma and ductal carcinoma, gastrointestinal cancers 
including esophageal cancer, stomach cancer, colon cancer, colorectal cancer and polyps 

25 associated with colorectal neoplasia, pancreatic cancers, liver cancer, urologic cancers 
including bladder cancer and prostate cancer, malignancies of the female genital tract 
including ovarian carcinoma, uterine (including endometrial) cancers, and solid tumor in 
the ovarian follicle, kidney cancers including renal cell carcinoma, brain cancers 
including intrinsic brain tumors, neuroblastoma, astrocytic brain tumors, gliomas, 

30 metastatic tumor cell invasion in the central nervous system, bone cancers including 

osteomas, skin cancers including malignant melanoma, tumor progression of human skin
keratinocytes, squamous cell carcinoma, basal cell carcinoma, hemangiopericytoma and
Kaposi's sarcoma.

Polypeptides, polynucleotides, or modulators of polypeptides of the invention 
(including inhibitors and stimulators of the biological activity of the polypeptide of the 
5 invention) may be administered to treat cancer. Therapeutic compositions can be 

administered in therapeutically effective dosages alone or in combination with adjuvant 
cancer therapy such as surgery, chemotherapy, radiotherapy, thermotherapy, and laser 
therapy, and may provide a beneficial effect, e.g. reducing tumor size, slowing rate of 
tumor growth, inhibiting metastasis, or otherwise improving overall clinical condition, 

10 without necessarily eradicating the cancer. 

The composition can also be administered in therapeutically effective amounts as 
a portion of an anti-cancer cocktail. An anti-cancer cocktail is a mixture of the 
polypeptide or modulator of the invention with one or more anti-cancer drugs in addition 
to a pharmaceutically acceptable carrier for delivery. The use of anti-cancer cocktails as 

15 a cancer treatment is routine. Anti-cancer drugs that are well known in the art and can be 
used as a treatment in combination with the polypeptide or modulator of the invention 
include: Actinomycin D, Aminoglutethimide, Asparaginase, Bleomycin, Busulfan,
Carboplatin, Carmustine, Chlorambucil, Cisplatin (cis-DDP), Cyclophosphamide,
Cytarabine HCl (Cytosine arabinoside), Dacarbazine, Dactinomycin, Daunorubicin HCl,
Doxorubicin HCl, Estramustine phosphate sodium, Etoposide (V16-213), Floxuridine,
5-Fluorouracil (5-Fu), Flutamide, Hydroxyurea (hydroxycarbamide), Ifosfamide, Interferon
Alpha-2a, Interferon Alpha-2b, Leuprolide acetate (LHRH-releasing factor analog),
Lomustine, Mechlorethamine HCl (nitrogen mustard), Melphalan, Mercaptopurine,
Mesna, Methotrexate (MTX), Mitomycin, Mitoxantrone HCl, Octreotide, Plicamycin,
Procarbazine HCl, Streptozocin, Tamoxifen citrate, Thioguanine, Thiotepa, Vinblastine
sulfate, Vincristine sulfate, Amsacrine, Azacitidine, Hexamethylmelamine, Interleukin-2,
Mitoguazone, Pentostatin, Semustine, Teniposide, and Vindesine sulfate.

In addition, therapeutic compositions of the invention may be used for
prophylactic treatment of cancer. There are hereditary conditions and/or environmental
situations (e.g. exposure to carcinogens) known in the art that predispose an individual to
developing cancers. Under these circumstances, it may be beneficial to treat these
individuals with therapeutically effective doses of the polypeptide of the invention to
reduce the risk of developing cancers.

In vitro models can be used to determine the effective doses of the polypeptide of
the invention as a potential cancer treatment. These in vitro models include proliferation
assays of cultured tumor cells, growth of cultured tumor cells in soft agar (see Freshney,
(1987) Culture of Animal Cells: A Manual of Basic Technique, Wiley-Liss, New York,
NY Ch 18 and Ch 21), tumor systems in nude mice as described in Giovanella et al., J.
Natl. Can. Inst., 52: 921-30 (1974), mobility and invasive potential of tumor cells in
Boyden Chamber assays as described in Pilkington et al., Anticancer Res., 17: 4107-9
(1997), and angiogenesis assays such as induction of vascularization of the chick
chorioallantoic membrane or induction of vascular endothelial cell migration as described
in Ribatti et al., Intl. J. Dev. Biol., 40: 1189-97 (1999) and Li et al., Clin. Exp.
Metastasis, 17:423-9 (1999), respectively. Suitable tumor cell lines are available, e.g.
from American Type Tissue Culture Collection catalogs.


4.10.12 RECEPTOR/LIGAND ACTIVITY 

A polypeptide of the present invention may also demonstrate activity as a receptor,
receptor ligand or inhibitor or agonist of receptor/ligand interactions. A polynucleotide
of the invention can encode a polypeptide exhibiting such characteristics. Examples of
such receptors and ligands include, without limitation, cytokine receptors and their
ligands, receptor kinases and their ligands, receptor phosphatases and their ligands,
receptors involved in cell-cell interactions and their ligands (including, without limitation,
cellular adhesion molecules (such as selectins, integrins and their ligands) and
receptor/ligand pairs involved in antigen presentation, antigen recognition and
development of cellular and humoral immune responses). Receptors and ligands are also
useful for screening of potential peptide or small molecule inhibitors of the relevant
receptor/ligand interaction. A protein of the present invention (including, without
limitation, fragments of receptors and ligands) may itself be useful as an inhibitor of
receptor/ligand interactions.

The activity of a polypeptide of the invention may, among other means, be
measured by the following methods:
Suitable assays for receptor-ligand activity include, without limitation, those
described in: Current Protocols in Immunology, Ed. by J. E. Coligan, A. M. Kruisbeek,
D. H. Margulies, E. M. Shevach, W. Strober, Pub. Greene Publishing Associates and
Wiley-Interscience (Chapter 7.28, Measurement of Cellular Adhesion under static
conditions 7.28.1-7.28.22); Takai et al., Proc. Natl. Acad. Sci. USA 84:6864-6868, 1987;
Bierer et al., J. Exp. Med. 168:1145-1156, 1988; Rosenstein et al., J. Exp. Med.
169:149-160, 1989; Stoltenborg et al., J. Immunol. Methods 175:59-68, 1994; Stitt et al.,
Cell 80:661-670, 1995.

By way of example, the polypeptides of the invention may be used as a receptor
for a ligand(s), thereby transmitting the biological activity of that ligand(s). Ligands may
be identified through binding assays, affinity chromatography, dihybrid screening assays,
BIAcore assays, gel overlay assays, or other methods known in the art.

Studies characterizing drugs or proteins as agonists, antagonists, partial agonists
or partial antagonists require the use of other proteins as competing ligands. The
polypeptides of the present invention or ligand(s) thereof may be labeled by being
coupled to radioisotopes, colorimetric molecules or toxin molecules by conventional
methods ("Guide to Protein Purification," Murray P. Deutscher (ed.), Methods in
Enzymology Vol. 182 (1990), Academic Press, Inc., San Diego). Examples of
radioisotopes include, but are not limited to, tritium and carbon-14. Examples of
colorimetric molecules include, but are not limited to, fluorescent molecules such as
fluorescamine or rhodamine, or other colorimetric molecules. Examples of toxins
include, but are not limited to, ricin.

4.10.13 DRUG SCREENING 

25 This invention is particularly useful for screening chemical compounds by using 

the novel polypeptides or binding fragments thereof in any of a variety of drug screening 
techniques. The polypeptides or fragments employed in such a test may either be free in 
solution, affixed to a solid support, borne on a cell surface or located intracellularly. One 
method of drug screening utilizes eukaryotic or prokaryotic host cells which are stably 

30 transformed with recombinant nucleic acids expressing the polypeptide or a fragment 
thereof. Drugs are screened against such transformed cells in competitive binding assays. 


WO 02/081731 



PCT/US02/01222 



Such cells, either in viable or fixed form, can be used for standard binding assays. One 
may measure, for example, the formation of complexes between polypeptides of the 
invention or fragments and the agent being tested or examine the diminution in complex 
formation between the novel polypeptides and an appropriate cell line, which are well 
known in the art.
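
As an illustration of how the diminution in complex formation described above might be quantified, the following minimal sketch expresses the reduction in a binding signal as a percentage. The function name, the background correction, and the choice of a simple signal ratio are assumptions made for this example; they do not describe any particular assay format.

```python
def percent_inhibition(signal_with_agent, signal_without_agent, background=0.0):
    """Percent reduction in complex formation caused by a test agent.

    All arguments are hypothetical binding readouts in the same units
    (for example, counts from a labeled binding partner).
    """
    specific_total = signal_without_agent - background
    if specific_total <= 0:
        raise ValueError("uninhibited signal must exceed background")
    specific_bound = signal_with_agent - background
    return 100.0 * (1.0 - specific_bound / specific_total)
```

For instance, an agent that lowers a background-corrected signal to one quarter of its uninhibited value would be scored as giving 75 percent inhibition under this definition.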

Sources for test compounds that may be screened for ability to bind to or 
modulate (i.e., increase or decrease) the activity of polypeptides of the invention include 
(1) inorganic and organic chemical libraries, (2) natural product libraries, and (3) 
combinatorial libraries comprised of either random or mimetic peptides, oligonucleotides 

10 or organic molecules. 

Chemical libraries may be readily synthesized or purchased from a number of 
commercial sources, and may include structural analogs of known compounds or 
compounds that are identified as "hits" or "leads" via natural product screening. 

The sources of natural product libraries are microorganisms (including bacteria 

15 and fungi), animals, plants or other vegetation, or marine organisms, and libraries of 

mixtures for screening may be created by: (1) fermentation and extraction of broths from 
soil, plant or marine microorganisms or (2) extraction of the organisms themselves. 
Natural product libraries include polyketides, non-ribosomal peptides, and (non-naturally 
occurring) variants thereof. For a review, see Science 282:63-68 (1998).

20 Combinatorial libraries are composed of large numbers of peptides, 

oligonucleotides or organic compounds and can be readily prepared by traditional 
automated synthesis methods, PCR, cloning or proprietary synthetic methods. Of 
particular interest are peptide and oligonucleotide combinatorial libraries. Still other 
libraries of interest include peptide, protein, peptidomimetic, multiparallel synthetic 

collection, recombinatorial, and polypeptide libraries. For a review of combinatorial
chemistry and libraries created therefrom, see Myers, Curr. Opin. Biotechnol. 8:701-707
(1997). For reviews and examples of peptidomimetic libraries, see Al-Obeidi et al., Mol.
Biotechnol. 9(3):205-23 (1998); Hruby et al., Curr. Opin. Chem. Biol. 1(1):114-19 (1997);
Dorner et al., Bioorg. Med. Chem. 4(5):709-15 (1996) (alkylated dipeptides).

Identification of modulators through use of the various libraries described herein
permits modification of the candidate "hit" (or "lead") to optimize the capacity of the
"hit" to bind a polypeptide of the invention. The molecules identified in the binding assay
are then tested for antagonist or agonist activity in in vivo tissue culture or animal models
that are well known in the art. In brief, the molecules are titrated into a plurality of cell
cultures or animals and then tested for either cell/animal death or prolonged survival of
the animal/cells.

The binding molecules thus identified may be complexed with toxins, e.g., ricin 
or cholera, or with other compounds that are toxic to cells such as radioisotopes. The 
toxin-binding molecule complex is then targeted to a tumor or other cell by the specificity 
of the binding molecule for a polypeptide of the invention. Alternatively, the binding 

10 molecules may be complexed with imaging agents for targeting and imaging purposes. 

4.10.14 ASSAY FOR RECEPTOR ACTIVITY

The invention also provides methods to detect specific binding of a polypeptide,
e.g., a ligand or a receptor. The art provides numerous assays particularly useful for
identifying previously unknown binding partners for receptor polypeptides of the
invention. For example, expression cloning using mammalian or bacterial cells, or
dihybrid screening assays can be used to identify polynucleotides encoding binding
partners. As another example, affinity chromatography with the appropriate immobilized
polypeptide of the invention can be used to isolate polypeptides that recognize and bind
polypeptides of the invention. There are a number of different libraries used for the
identification of compounds, and in particular small molecules, that modulate (i.e.,
increase or decrease) biological activity of a polypeptide of the invention. Ligands for
receptor polypeptides of the invention can also be identified by adding exogenous
ligands, or cocktails of ligands, to two cell populations that are genetically identical
except for the expression of the receptor of the invention: one cell population expresses
the receptor of the invention whereas the other does not. The responses of the two cell
populations to the addition of ligand(s) are then compared. Alternatively, an expression
library can be co-expressed with the polypeptide of the invention in cells and assayed for
an autocrine response to identify potential ligand(s). As still another example, BIAcore
assays, gel overlay assays, or other methods known in the art can be used to identify
binding partner polypeptides, including (1) organic and inorganic chemical libraries, (2)
natural product libraries, and (3) combinatorial libraries comprised of random peptides,
oligonucleotides or organic molecules.
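
The comparison of two otherwise identical cell populations described above can be sketched as follows. This is a minimal illustration only: the readout values are supplied by the user, and the function names, the use of a difference of means, and the cutoff parameter are assumptions made for the example rather than features of any particular assay.

```python
from statistics import mean


def differential_response(receptor_positive, receptor_negative):
    """Difference in mean response between cells that express the
    receptor and genetically matched cells that do not."""
    return mean(receptor_positive) - mean(receptor_negative)


def candidate_ligands(readouts, min_difference):
    """Return names of ligands whose response depends on the receptor.

    readouts maps a ligand name to a pair of replicate lists:
    (responses of receptor-expressing cells, responses of control cells).
    min_difference is a hypothetical cutoff chosen by the experimenter.
    """
    return sorted(
        name
        for name, (positive, negative) in readouts.items()
        if differential_response(positive, negative) >= min_difference
    )
```

A ligand that elicits a response only in the receptor-expressing population would show a large difference, whereas a ligand acting through some other cellular component would be expected to affect both populations alike.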

The role of downstream intracellular signaling molecules in the signaling cascade
of the polypeptide of the invention can be determined. For example, a chimeric protein in
which the cytoplasmic domain of the polypeptide of the invention is fused to the
extracellular portion of a protein, whose ligand has been identified, is produced in a host
cell. The cell is then incubated with the ligand specific for the extracellular portion of the
chimeric protein, thereby activating the chimeric receptor. Known downstream proteins
involved in intracellular signaling can then be assayed for expected modifications, i.e.,
phosphorylation. Other methods known to those in the art can also be used to identify
signaling molecules involved in receptor activity.

4.10.15 ANTI-INFLAMMATORY ACTIVITY 

Compositions of the present invention may also exhibit anti-inflammatory
activity. The anti-inflammatory activity may be achieved by providing a stimulus to cells
involved in the inflammatory response, by inhibiting or promoting cell-cell interactions
(such as, for example, cell adhesion), by inhibiting or promoting chemotaxis of cells
involved in the inflammatory process, inhibiting or promoting cell extravasation, or by
stimulating or suppressing production of other factors which more directly inhibit or
promote an inflammatory response. Compositions with such activities can be used to treat
inflammatory conditions (including chronic or acute conditions), including without
limitation inflammation associated with infection (such as septic shock, sepsis or systemic
inflammatory response syndrome (SIRS)), ischemia-reperfusion injury, endotoxin
lethality, arthritis, complement-mediated hyperacute rejection, nephritis, cytokine or
chemokine-induced lung injury, inflammatory bowel disease, Crohn's disease or resulting
from over production of cytokines such as TNF or IL-1. Compositions of the invention
may also be useful to treat anaphylaxis and hypersensitivity to an antigenic substance or
material. Compositions of this invention may be utilized to prevent or treat conditions
such as, but not limited to, sepsis, acute pancreatitis, endotoxin shock, cytokine induced
shock, rheumatoid arthritis, chronic inflammatory arthritis, pancreatic cell damage from
diabetes mellitus type 1, graft versus host disease, inflammatory bowel disease,
inflammation associated with pulmonary disease, other autoimmune disease or
inflammatory disease, or may be utilized as an antiproliferative agent, such as for acute
or chronic myelogenous leukemia, or in the prevention of premature labor secondary to
intrauterine infections.

4.10.16 LEUKEMIAS

Leukemias and related disorders may be treated or prevented by administration of
a therapeutic that promotes or inhibits function of the polynucleotides and/or
polypeptides of the invention. Such leukemias and related disorders include but are not
limited to acute leukemia, acute lymphocytic leukemia, acute myelocytic leukemia,
myeloblastic, promyelocytic, myelomonocytic, monocytic, erythroleukemia, chronic
leukemia, chronic myelocytic (granulocytic) leukemia and chronic lymphocytic leukemia
(for a review of such disorders, see Fishman et al., 1985, Medicine, 2d Ed., J.B.
Lippincott Co., Philadelphia).

4.10.17 NERVOUS SYSTEM DISORDERS

Nervous system disorders, involving cell types which can be tested for efficacy of 
intervention with compounds that modulate the activity of the polynucleotides and/or 
polypeptides of the invention, and which can be treated upon thus observing an indication 
of therapeutic utility, include but are not limited to nervous system injuries, and diseases 

20 or disorders which result in either a disconnection of axons, a diminution or degeneration 
of neurons, or demyelination. Nervous system lesions which may be treated in a patient 
(including human and non-human mammalian patients) according to the invention 
include but are not limited to the following lesions of either the central (including spinal 
cord, brain) or peripheral nervous systems: 

(i) traumatic lesions, including lesions caused by physical injury or associated
with surgery, for example, lesions which sever a portion of the nervous system, or
compression injuries;

(ii) ischemic lesions, in which a lack of oxygen in a portion of the nervous
system results in neuronal injury or death, including cerebral infarction or ischemia, or
spinal cord infarction or ischemia;

(iii) infectious lesions, in which a portion of the nervous system is destroyed or
injured as a result of infection, for example, by an abscess or associated with infection by
human immunodeficiency virus, herpes zoster, or herpes simplex virus or with Lyme
disease, tuberculosis, or syphilis;

(iv) degenerative lesions, in which a portion of the nervous system is destroyed
or injured as a result of a degenerative process including but not limited to degeneration
associated with Parkinson's disease, Alzheimer's disease, Huntington's chorea, or
amyotrophic lateral sclerosis;

(v) lesions associated with nutritional diseases or disorders, in which a portion
of the nervous system is destroyed or injured by a nutritional disorder or disorder of
metabolism including but not limited to vitamin B12 deficiency, folic acid deficiency,
Wernicke disease, tobacco-alcohol amblyopia, Marchiafava-Bignami disease (primary
degeneration of the corpus callosum), and alcoholic cerebellar degeneration;

(vi) neurological lesions associated with systemic diseases including but not
limited to diabetes (diabetic neuropathy, Bell's palsy), systemic lupus erythematosus,
carcinoma, or sarcoidosis;

(vii) lesions caused by toxic substances including alcohol, lead, or particular
neurotoxins; and

(viii) demyelinated lesions in which a portion of the nervous system is
destroyed or injured by a demyelinating disease including but not limited to multiple
sclerosis, human immunodeficiency virus-associated myelopathy, transverse myelopathy
of various etiologies, progressive multifocal leukoencephalopathy, and central pontine
myelinolysis.

Therapeutics which are useful according to the invention for treatment of a
nervous system disorder may be selected by testing for biological activity in promoting
the survival or differentiation of neurons. For example, and not by way of limitation,
therapeutics which elicit any of the following effects may be useful according to the
invention:

(i) increased survival time of neurons in culture;

(ii) increased sprouting of neurons in culture or in vivo;

(iii) increased production of a neuron-associated molecule in culture or in vivo,
e.g., choline acetyltransferase or acetylcholinesterase with respect to motor neurons; or

(iv) decreased symptoms of neuron dysfunction in vivo.

Such effects may be measured by any method known in the art. In preferred,
non-limiting embodiments, increased survival of neurons may be measured by the
method set forth in Arakawa et al. (1990, J. Neurosci. 10:3507-3515); increased sprouting
of neurons may be detected by methods set forth in Pestronk et al. (1980, Exp. Neurol.
70:65-82) or Brown et al. (1981, Ann. Rev. Neurosci. 4:17-42); increased production of
neuron-associated molecules may be measured by bioassay, enzymatic assay, antibody
binding, Northern blot assay, etc., depending on the molecule to be measured; and motor
neuron dysfunction may be measured by assessing the physical manifestation of motor
neuron disorder, e.g., weakness, motor neuron conduction velocity, or functional
disability.

In specific embodiments, motor neuron disorders that may be treated according to 
15 the invention include but are not limited to disorders such as infarction, infection, 

exposure to toxin, trauma, surgical damage, degenerative disease or malignancy that may 
affect motor neurons as well as other components of the nervous system, as well as 
disorders that selectively affect neurons such as amyotrophic lateral sclerosis, and 
including but not limited to progressive spinal muscular atrophy, progressive bulbar 
20 palsy, primary lateral sclerosis, infantile and juvenile muscular atrophy, progressive 
bulbar paralysis of childhood (Fazio-Londe syndrome), poliomyelitis and the post polio 
syndrome, and Hereditary Motorsensory Neuropathy (Charcot-Marie-Tooth Disease). 

4.10.18 OTHER ACTIVITIES

A polypeptide of the invention may also exhibit one or more of the following
additional activities or effects: inhibiting the growth, infection or function of, or killing,
infectious agents, including, without limitation, bacteria, viruses, fungi and other
parasites; affecting (suppressing or enhancing) bodily characteristics, including, without
limitation, height, weight, hair color, eye color, skin, fat to lean ratio or other tissue
pigmentation, or organ or body part size or shape (such as, for example, breast
augmentation or diminution, change in bone form or shape); affecting biorhythms or
circadian cycles or rhythms; affecting the fertility of male or female subjects; affecting
the metabolism, catabolism, anabolism, processing, utilization, storage or elimination of
dietary fat, lipid, protein, carbohydrate, vitamins, minerals, co-factors or other nutritional
factors or component(s); affecting behavioral characteristics, including, without
limitation, appetite, libido, stress, cognition (including cognitive disorders), depression
(including depressive disorders) and violent behaviors; providing analgesic effects or
other pain reducing effects; promoting differentiation and growth of embryonic stem cells
in lineages other than hematopoietic lineages; hormonal or endocrine activity; in the case
of enzymes, correcting deficiencies of the enzyme and treating deficiency-related
diseases; treatment of hyperproliferative disorders (such as, for example, psoriasis);
immunoglobulin-like activity (such as, for example, the ability to bind antigens or
complement); and the ability to act as an antigen in a vaccine composition to raise an
immune response against such protein or another material or entity which is
cross-reactive with such protein.


4.10.19 IDENTIFICATION OF POLYMORPHISMS 

The demonstration of polymorphisms makes possible the identification of such
polymorphisms in human subjects and the pharmacogenetic use of this information for
diagnosis and treatment. Such polymorphisms may be associated with, e.g., differential
predisposition or susceptibility to various disease states (such as disorders involving
inflammation or immune response) or a differential response to drug administration, and
this genetic information can be used to tailor preventive or therapeutic treatment
appropriately. For example, the existence of a polymorphism associated with a
predisposition to inflammation or autoimmune disease makes possible the diagnosis of
this condition in humans by identifying the presence of the polymorphism.

Polymorphisms can be identified in a variety of ways known in the art which all 
generally involve obtaining a sample from a patient, analyzing DNA from the sample, 
optionally involving isolation or amplification of the DNA, and identifying the presence 
of the polymorphism in the DNA. For example, PCR may be used to amplify an 

appropriate fragment of genomic DNA which may then be sequenced. Alternatively, the
DNA may be subjected to allele-specific oligonucleotide hybridization (in which

appropriate oligonucleotides are hybridized to the DNA under conditions permitting
detection of a single base mismatch) or to a single nucleotide extension assay (in which
an oligonucleotide that hybridizes immediately adjacent to the position of the
polymorphism is extended with one or more labeled nucleotides). In addition, traditional
restriction fragment length polymorphism analysis (using restriction enzymes that
provide differential digestion of the genomic DNA depending on the presence or absence
of the polymorphism) may be performed. Arrays with nucleotide sequences of the
present invention can be used to detect polymorphisms. The array can comprise modified
nucleotide sequences of the present invention in order to detect the nucleotide sequences
of the present invention. In the alternative, any one of the nucleotide sequences of the
present invention can be placed on the array to detect changes from those sequences.

Alternatively a polymorphism resulting in a change in the amino acid sequence 
could also be detected by detecting a corresponding change in amino acid sequence of the 
protein, e.g., by an antibody specific to the variant sequence. 
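
The principle behind restriction fragment length polymorphism analysis, in which a polymorphism creates or abolishes a restriction site and thereby changes the digestion pattern, can be illustrated with the following minimal sketch. The sequences in the usage note are invented placeholders, the helper names are hypothetical, and the model is deliberately simplified: it treats the DNA as a single linear strand and searches only the given strand for the recognition site.

```python
def fragment_lengths(sequence, site, cut_offset):
    """Lengths of fragments produced by cutting a linear sequence at
    every occurrence of a recognition site.

    cut_offset is the number of bases from the start of the site to the
    cut position (1 for EcoRI, which cuts G^AATTC).
    """
    seq = sequence.upper()
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = seq.find(site, start + 1)
    bounds = [0] + cuts + [len(seq)]
    return [end - begin for begin, end in zip(bounds, bounds[1:])]


def rflp_distinguishes(reference, variant, site, cut_offset):
    """True if digestion gives different fragment patterns for the
    reference and variant alleles."""
    return (fragment_lengths(reference, site, cut_offset)
            != fragment_lengths(variant, site, cut_offset))
```

For example, with the placeholder reference sequence AAAAGAATTCTTTT and a variant AAAAGAGTTCTTTT in which a single base change abolishes the EcoRI site GAATTC, the reference is cut into two fragments while the variant remains uncut, so the two alleles would be distinguishable by fragment size.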


4.10.20 ARTHRITIS AND INFLAMMATION 

The immunosuppressive effects of the compositions of the invention against
rheumatoid arthritis are determined in an experimental animal model system. The
experimental model system is adjuvant induced arthritis in rats, and the protocol is
described by J. Holoshitz et al., 1983, Science, 219:56, or by B. Waksman et al., 1963,
Int. Arch. Allergy Appl. Immunol., 23:129. Induction of the disease can be caused by a
single injection, generally intradermally, of a suspension of killed Mycobacterium
tuberculosis in complete Freund's adjuvant (CFA). The route of injection can vary, but
rats may be injected at the base of the tail with an adjuvant mixture. The polypeptide is
administered in phosphate-buffered saline (PBS) at a dose of about 1-5 mg/kg. The
control consists of administering PBS only.

The procedure for testing the effects of the test compound would consist of
intradermally injecting killed Mycobacterium tuberculosis in CFA followed by
immediately administering the test compound and subsequent treatment every other day
until day 24. At 14, 15, 18, 20, 22, and 24 days after injection of Mycobacterium in CFA, an
overall arthritis score may be obtained as described by J. Holoshitz above. An analysis of
the data would reveal that the test compound would have a dramatic effect on the
swelling of the joints as measured by a decrease of the arthritis score.
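
The treatment and scoring schedule described above can be laid out as in the following minimal sketch. It assumes that the injection of Mycobacterium in CFA is counted as day 0 and that "every other day until day 24" means days 0, 2, 4 and so on through day 24; the data layout, the function names, and the use of group means are assumptions made for this example, and no scores are supplied here.

```python
from statistics import mean

# Day 0 is taken to be the day of adjuvant injection (an assumption).
TREATMENT_DAYS = list(range(0, 25, 2))
SCORING_DAYS = [14, 15, 18, 20, 22, 24]


def mean_scores(scores_by_animal):
    """Mean arthritis score per scoring day.

    scores_by_animal is a list with one dict per animal, mapping each
    scoring day to that animal's overall arthritis score.
    """
    return {
        day: mean(animal[day] for animal in scores_by_animal)
        for day in SCORING_DAYS
    }


def score_reduction(treated, control):
    """Control mean minus treated mean for each scoring day; positive
    values indicate a lower score in the treated group."""
    treated_means = mean_scores(treated)
    control_means = mean_scores(control)
    return {day: control_means[day] - treated_means[day] for day in SCORING_DAYS}
```

A consistently positive reduction across the scoring days in the treated group relative to the PBS-only control would correspond to the decrease in arthritis score referred to above.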

4.11 THERAPEUTIC METHODS

The compositions (including polypeptide fragments, analogs, variants and
antibodies or other binding partners or modulators including antisense polynucleotides)
of the invention have numerous applications in a variety of therapeutic methods.
Examples of therapeutic applications include, but are not limited to, those exemplified
herein.

4.11.1 EXAMPLE 

One embodiment of the invention is the administration of an effective amount of
the polypeptides or other composition of the invention to individuals affected by a
disease or disorder that can be modulated by regulating the peptides of the invention.
While the mode of administration is not particularly important, parenteral administration
is preferred. An exemplary mode of administration is to deliver an intravenous bolus.
The dosage of the polypeptides or other composition of the invention will normally be
determined by the prescribing physician. It is to be expected that the dosage will vary
according to the age, weight, condition and response of the individual patient. Typically,
the amount of polypeptide administered per dose will be in the range of about 0.01 µg/kg
to 100 mg/kg of body weight, with the preferred dose being about 0.1 µg/kg to 10 mg/kg
of patient body weight. For parenteral administration, polypeptides of the invention will
be formulated in an injectable form combined with a pharmaceutically acceptable
parenteral vehicle. Such vehicles are well known in the art and examples include water,
saline, Ringer's solution, dextrose solution, and solutions consisting of small amounts of
human serum albumin. The vehicle may contain minor amounts of additives that
maintain the isotonicity and stability of the polypeptide or other active ingredient. The
preparation of such solutions is within the skill of the art.

4.12 PHARMACEUTICAL FORMULATIONS AND ROUTES OF
ADMINISTRATION 

A protein or other composition of the present invention (from whatever source
derived, including without limitation from recombinant and non-recombinant sources and
including antibodies and other binding partners of the polypeptides of the invention) may
be administered to a patient in need, by itself, or in pharmaceutical compositions where it
is mixed with suitable carriers or excipient(s) at doses to treat or ameliorate a variety of
disorders. Such a composition may optionally contain (in addition to protein or other
active ingredient and a carrier) diluents, fillers, salts, buffers, stabilizers, solubilizers, and
other materials well known in the art. The term "pharmaceutically acceptable" means a
non-toxic material that does not interfere with the effectiveness of the biological activity
of the active ingredient(s). The characteristics of the carrier will depend on the route of
administration. The pharmaceutical composition of the invention may also contain
cytokines, lymphokines, or other hematopoietic factors such as M-CSF, GM-CSF, TNF,
IL-1, IL-2, IL-3, IL-4, IL-5, IL-6, IL-7, IL-8, IL-9, IL-10, IL-11, IL-12, IL-13, IL-14,
IL-15, IFN, TNF0, TNF1, TNF2, G-CSF, Meg-CSF, thrombopoietin, stem cell factor,
and erythropoietin. In further compositions, proteins of the invention may be combined
with other agents beneficial to the treatment of the disease or disorder in question. These
agents include various growth factors such as epidermal growth factor (EGF),
platelet-derived growth factor (PDGF), transforming growth factors (TGF-α and TGF-β),
insulin-like growth factor (IGF), as well as cytokines described herein.

The pharmaceutical composition may further contain other agents which either
enhance the activity of the protein or other active ingredient or complement its activity or
use in treatment. Such additional factors and/or agents may be included in the
pharmaceutical composition to produce a synergistic effect with protein or other active
ingredient of the invention, or to minimize side effects. Conversely, protein or other
active ingredient of the present invention may be included in formulations of the
particular clotting factor, cytokine, lymphokine, other hematopoietic factor, thrombolytic
or anti-thrombotic factor, or anti-inflammatory agent to minimize side effects of the
clotting factor, cytokine, lymphokine, other hematopoietic factor, thrombolytic or
anti-thrombotic factor, or anti-inflammatory agent (such as IL-1Ra, IL-1 Hy1, IL-1 Hy2,
anti-TNF, corticosteroids, immunosuppressive agents). A protein of the present
invention may be active in multimers (e.g., heterodimers or homodimers) or complexes
with itself or other proteins. As a result, pharmaceutical compositions of the invention
may comprise a protein of the invention in such multimeric or complexed form.

As an alternative to being included in a pharmaceutical composition of the
invention including a first protein, a second protein or a therapeutic agent may be
concurrently administered with the first protein (e.g., at the same time, or at differing
times provided that therapeutic concentrations of the combination of agents are achieved at
the treatment site). Techniques for formulation and administration of the compounds of
the instant application may be found in "Remington's Pharmaceutical Sciences," Mack
Publishing Co., Easton, PA, latest edition. A therapeutically effective dose further refers
to that amount of the compound sufficient to result in amelioration of symptoms, e.g.,
treatment, healing, prevention or amelioration of the relevant medical condition, or an
increase in rate of treatment, healing, prevention or amelioration of such conditions.
When applied to an individual active ingredient, administered alone, a therapeutically
effective dose refers to that ingredient alone. When applied to a combination, a
therapeutically effective dose refers to combined amounts of the active ingredients that
result in the therapeutic effect, whether administered in combination, serially or
simultaneously.

In practicing the method of treatment or use of the present invention, a
therapeutically effective amount of protein or other active ingredient of the present
invention is administered to a mammal having a condition to be treated. Protein or other
active ingredient of the present invention may be administered in accordance with the
method of the invention either alone or in combination with other therapies such as
treatments employing cytokines, lymphokines or other hematopoietic factors. When
co-administered with one or more cytokines, lymphokines or other hematopoietic factors,
protein or other active ingredient of the present invention may be administered either
simultaneously with the cytokine(s), lymphokine(s), other hematopoietic factor(s),
thrombolytic or anti-thrombotic factors, or sequentially. If administered sequentially, the
attending physician will decide on the appropriate sequence of administering protein or
other active ingredient of the present invention in combination with cytokine(s),
lymphokine(s), other hematopoietic factor(s), thrombolytic or anti-thrombotic factors.

4.12.1 ROUTES OF ADMINISTRATION 

Suitable routes of administration may, for example, include oral, rectal,
transmucosal, or intestinal administration; parenteral delivery, including intramuscular,
subcutaneous, intramedullary injections, as well as intrathecal, direct intraventricular,
intravenous, intraperitoneal, intranasal, or intraocular injections. Administration of
protein or other active ingredient of the present invention used in the pharmaceutical
composition or to practice the method of the present invention can be carried out in a
variety of conventional ways, such as oral ingestion, inhalation, topical application or
cutaneous, subcutaneous, intraperitoneal, parenteral or intravenous injection. Intravenous
administration to the patient is preferred.

Alternatively, one may administer the compound in a local rather than systemic
manner, for example, via injection of the compound directly into arthritic joints or into
fibrotic tissue, often in a depot or sustained release formulation. In order to prevent the
scarring process frequently occurring as a complication of glaucoma surgery, the
compounds may be administered topically, for example, as eye drops. Furthermore, one
may administer the drug in a targeted drug delivery system, for example, in a liposome
coated with a specific antibody, targeting, for example, arthritic or fibrotic tissue. The
liposomes will be targeted to and taken up selectively by the afflicted tissue.

The polypeptides of the invention are administered by any route that delivers an
effective dosage to the desired site of action. The determination of a suitable route of
administration and an effective dosage for a particular indication is within the level of
skill in the art. Preferably for wound treatment, one administers the therapeutic
compound directly to the site. Suitable dosage ranges for the polypeptides of the
invention can be extrapolated from these dosages or from similar studies in appropriate
animal models. Dosages can then be adjusted as necessary by the clinician to provide
maximal therapeutic benefit.

4.12.2 COMPOSITIONS/FORMULATIONS

Pharmaceutical compositions for use in accordance with the present invention
thus may be formulated in a conventional manner using one or more physiologically
acceptable carriers comprising excipients and auxiliaries which facilitate processing of
the active compounds into preparations which can be used pharmaceutically. These
pharmaceutical compositions may be manufactured in a manner that is itself known, e.g.,
by means of conventional mixing, dissolving, granulating, dragee-making, levigating,
emulsifying, encapsulating, entrapping or lyophilizing processes. Proper formulation is
dependent upon the route of administration chosen. When a therapeutically effective
amount of protein or other active ingredient of the present invention is administered
orally, protein or other active ingredient of the present invention will be in the form of a
tablet, capsule, powder, solution or elixir. When administered in tablet form, the
pharmaceutical composition of the invention may additionally contain a solid carrier such
as a gelatin or an adjuvant. The tablet, capsule, and powder contain from about 5 to 95%
protein or other active ingredient of the present invention, and preferably from about 25
to 90% protein or other active ingredient of the present invention. When administered in
liquid form, a liquid carrier such as water, petroleum, oils of animal or plant origin such
as peanut oil, mineral oil, soybean oil, or sesame oil, or synthetic oils may be added. The
liquid form of the pharmaceutical composition may further contain physiological saline
solution, dextrose or other saccharide solution, or glycols such as ethylene glycol,
propylene glycol or polyethylene glycol. When administered in liquid form, the
pharmaceutical composition contains from about 0.5 to 90% by weight of protein or other
active ingredient of the present invention, and preferably from about 1 to 50% protein or
other active ingredient of the present invention.

When a therapeutically effective amount of protein or other active ingredient of
the present invention is administered by intravenous, cutaneous or subcutaneous
injection, protein or other active ingredient of the present invention will be in the form of
a pyrogen-free, parenterally acceptable aqueous solution. The preparation of such
parenterally acceptable protein or other active ingredient solutions, having due regard to
pH, isotonicity, stability, and the like, is within the skill in the art. A preferred
pharmaceutical composition for intravenous, cutaneous, or subcutaneous injection should
contain, in addition to protein or other active ingredient of the present invention, an
isotonic vehicle such as Sodium Chloride Injection, Ringer's Injection, Dextrose
Injection, Dextrose and Sodium Chloride Injection, Lactated Ringer's Injection, or other
vehicle as known in the art. The pharmaceutical composition of the present invention
may also contain stabilizers, preservatives, buffers, antioxidants, or other additives
known to those of skill in the art. For injection, the agents of the invention may be
formulated in aqueous solutions, preferably in physiologically compatible buffers such as
Hanks' solution, Ringer's solution, or physiological saline buffer. For transmucosal
administration, penetrants appropriate to the barrier to be permeated are used in the
formulation. Such penetrants are generally known in the art.

For oral administration, the compounds can be formulated readily by combining
the active compounds with pharmaceutically acceptable carriers well known in the art.
Such carriers enable the compounds of the invention to be formulated as tablets, pills,
dragees, capsules, liquids, gels, syrups, slurries, suspensions and the like, for oral
ingestion by a patient to be treated. Pharmaceutical preparations for oral use can be
obtained using a solid excipient, optionally grinding the resulting mixture, and processing
the mixture of granules, after adding suitable auxiliaries, if desired, to obtain tablets or
dragee cores. Suitable excipients are, in particular, fillers such as sugars, including
lactose, sucrose, mannitol, or sorbitol; cellulose preparations such as, for example, maize
starch, wheat starch, rice starch, potato starch, gelatin, gum tragacanth, methyl cellulose,
hydroxypropylmethyl-cellulose, sodium carboxymethylcellulose, and/or
polyvinylpyrrolidone (PVP). If desired, disintegrating agents may be added, such as
cross-linked polyvinyl pyrrolidone, agar, or alginic acid or a salt thereof such as sodium
alginate. Dragee cores are provided with suitable coatings. For this purpose,
concentrated sugar solutions may be used, which may optionally contain gum arabic, talc,
polyvinyl pyrrolidone, carbopol gel, polyethylene glycol, and/or titanium dioxide, lacquer
solutions, and suitable organic solvents or solvent mixtures. Dyestuffs or pigments may
be added to the tablets or dragee coatings for identification or to characterize different
combinations of active compound doses.

Pharmaceutical preparations which can be used orally include push-fit capsules
made of gelatin, as well as soft, sealed capsules made of gelatin and a plasticizer, such as
glycerol or sorbitol. The push-fit capsules can contain the active ingredients in admixture
with filler such as lactose, binders such as starches, and/or lubricants such as talc or
magnesium stearate and, optionally, stabilizers. In soft capsules, the active compounds
may be dissolved or suspended in suitable liquids, such as fatty oils, liquid paraffin, or
liquid polyethylene glycols. In addition, stabilizers may be added. All formulations for
oral administration should be in dosages suitable for such administration. For buccal
administration, the compositions may take the form of tablets or lozenges formulated in
conventional manner.

For administration by inhalation, the compounds for use according to the present
invention are conveniently delivered in the form of an aerosol spray presentation from
pressurized packs or a nebuliser, with the use of a suitable propellant, e.g.,
dichlorodifluoromethane, trichlorofluoromethane, dichlorotetrafluoroethane, carbon
dioxide or other suitable gas. In the case of a pressurized aerosol the dosage unit may be
determined by providing a valve to deliver a metered amount. Capsules and cartridges
of, e.g., gelatin for use in an inhaler or insufflator may be formulated containing a powder
mix of the compound and a suitable powder base such as lactose or starch. The
compounds may be formulated for parenteral administration by injection, e.g., by bolus
injection or continuous infusion. Formulations for injection may be presented in unit
dosage form, e.g., in ampules or in multi-dose containers, with an added preservative.
The compositions may take such forms as suspensions, solutions or emulsions in oily or
aqueous vehicles, and may contain formulatory agents such as suspending, stabilizing
and/or dispersing agents.

Pharmaceutical formulations for parenteral administration include aqueous
solutions of the active compounds in water-soluble form. Additionally, suspensions of
the active compounds may be prepared as appropriate oily injection suspensions.
Suitable lipophilic solvents or vehicles include fatty oils such as sesame oil, or synthetic
fatty acid esters, such as ethyl oleate or triglycerides, or liposomes. Aqueous injection
suspensions may contain substances which increase the viscosity of the suspension, such
as sodium carboxymethyl cellulose, sorbitol, or dextran. Optionally, the suspension may
also contain suitable stabilizers or agents which increase the solubility of the compounds
to allow for the preparation of highly concentrated solutions. Alternatively, the active
ingredient may be in powder form for constitution with a suitable vehicle, e.g., sterile
pyrogen-free water, before use.

The compounds may also be formulated in rectal compositions such as
suppositories or retention enemas, e.g., containing conventional suppository bases such as
cocoa butter or other glycerides. In addition to the formulations described previously, the
compounds may also be formulated as a depot preparation. Such long acting
formulations may be administered by implantation (for example subcutaneously or
intramuscularly) or by intramuscular injection. Thus, for example, the compounds may
be formulated with suitable polymeric or hydrophobic materials (for example as an
emulsion in an acceptable oil) or ion exchange resins, or as sparingly soluble derivatives,
for example, as a sparingly soluble salt.

A pharmaceutical carrier for the hydrophobic compounds of the invention is a
co-solvent system comprising benzyl alcohol, a nonpolar surfactant, a water-miscible
organic polymer, and an aqueous phase. The co-solvent system may be the VPD
co-solvent system. VPD is a solution of 3% w/v benzyl alcohol, 8% w/v of the nonpolar
surfactant polysorbate 80, and 65% w/v polyethylene glycol 300, made up to volume in
absolute ethanol. The VPD co-solvent system (VPD:5W) consists of VPD diluted 1:1
with a 5% dextrose in water solution. This co-solvent system dissolves hydrophobic
compounds well, and itself produces low toxicity upon systemic administration.
Naturally, the proportions of a co-solvent system may be varied considerably without
destroying its solubility and toxicity characteristics. Furthermore, the identity of the
co-solvent components may be varied: for example, other low-toxicity nonpolar
surfactants may be used instead of polysorbate 80; the fraction size of polyethylene
glycol may be varied; other biocompatible polymers may replace polyethylene glycol,
e.g., polyvinyl pyrrolidone; and other sugars or polysaccharides may substitute for
dextrose. Alternatively, other delivery systems for hydrophobic pharmaceutical
compounds may be employed. Liposomes and emulsions are well known examples of
delivery vehicles or carriers for hydrophobic drugs. Certain organic solvents such as
dimethylsulfoxide also may be employed, although usually at the cost of greater toxicity.
Additionally, the compounds may be delivered using a sustained-release system, such as
semipermeable matrices of solid hydrophobic polymers containing the therapeutic agent.
Various types of sustained-release materials have been established and are well known by
those skilled in the art. Sustained-release capsules may, depending on their chemical
nature, release the compounds for a few weeks up to over 100 days. Depending on the
chemical nature and the biological stability of the therapeutic reagent, additional
strategies for protein or other active ingredient stabilization may be employed.
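
By way of illustration only, the nominal composition of the VPD:5W co-solvent system described above can be worked out from the stated VPD stock. The following Python sketch assumes that "diluted 1:1" means mixing equal volumes and that volumes are additive, which is only approximately true for ethanol/water mixtures; the function and variable names are hypothetical and are not part of the disclosure.

```python
# Composition of the VPD stock as stated in the text (% w/v).
VPD_STOCK = {
    "benzyl alcohol": 3.0,
    "polysorbate 80": 8.0,
    "polyethylene glycol 300": 65.0,
}

# 5% dextrose in water (% w/v).
D5W = {"dextrose": 5.0}


def mix_equal_volumes(a, b):
    """Return the % w/v of each solute after mixing equal volumes of a and b.

    Assumption: volumes are additive, so every concentration is halved.
    """
    solutes = set(a) | set(b)
    return {name: (a.get(name, 0.0) + b.get(name, 0.0)) / 2.0 for name in solutes}


vpd_5w = mix_equal_volumes(VPD_STOCK, D5W)
for name in sorted(vpd_5w):
    print(f"{name}: {vpd_5w[name]:.1f}% w/v")
```

Under these assumptions each component of the stock is simply halved in the final system, and the dextrose concentration falls to half of that in the diluent.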

The pharmaceutical compositions also may comprise suitable solid or gel phase
carriers or excipients. Examples of such carriers or excipients include but are not limited
to calcium carbonate, calcium phosphate, various sugars, starches, cellulose derivatives,
gelatin, and polymers such as polyethylene glycols. Many of the active ingredients of the
invention may be provided as salts with pharmaceutically compatible counter ions. Such
pharmaceutically acceptable base addition salts are those salts which retain the biological
effectiveness and properties of the free acids and which are obtained by reaction with
inorganic or organic bases such as sodium hydroxide, magnesium hydroxide, ammonia,
trialkylamine, dialkylamine, monoalkylamine, dibasic amino acids, sodium acetate,
potassium benzoate, triethanol amine and the like.

The pharmaceutical composition of the invention may be in the form of a
complex of the protein(s) or other active ingredient(s) of the present invention along with
protein or peptide antigens. The protein and/or peptide antigen will deliver a stimulatory
signal to both B and T lymphocytes. B lymphocytes will respond to antigen through their
surface immunoglobulin receptor. T lymphocytes will respond to antigen through the T
cell receptor (TCR) following presentation of the antigen by MHC proteins. MHC and
structurally related proteins including those encoded by class I and class II MHC genes
on host cells will serve to present the peptide antigen(s) to T lymphocytes. The antigen
components could also be supplied as purified MHC-peptide complexes alone or with
co-stimulatory molecules that can directly signal T cells. Alternatively, antibodies able to
bind surface immunoglobulin and other molecules on B cells as well as antibodies able to
bind the TCR and other molecules on T cells can be combined with the pharmaceutical
composition of the invention.

The pharmaceutical composition of the invention may be in the form of a
liposome in which protein of the present invention is combined, in addition to other
pharmaceutically acceptable carriers, with amphipathic agents such as lipids which exist
in aggregated form as micelles, insoluble monolayers, liquid crystals, or lamellar layers
in aqueous solution. Suitable lipids for liposomal formulation include, without limitation,
monoglycerides, diglycerides, sulfatides, lysolecithins, phospholipids, saponin, bile acids,
and the like. Preparation of such liposomal formulations is within the level of skill in the
art, as disclosed, for example, in U.S. Patent Nos. 4,235,871; 4,501,728; 4,837,028; and
4,737,323, all of which are incorporated herein by reference.

The amount of protein or other active ingredient of the present invention in the
pharmaceutical composition of the present invention will depend upon the nature and
severity of the condition being treated, and on the nature of prior treatments which the
patient has undergone. Ultimately, the attending physician will decide the amount of
protein or other active ingredient of the present invention with which to treat each
individual patient. Initially, the attending physician will administer low doses of protein
or other active ingredient of the present invention and observe the patient's response.
Larger doses of protein or other active ingredient of the present invention may be
administered until the optimal therapeutic effect is obtained for the patient, and at that
point the dosage is not increased further. It is contemplated that the various
pharmaceutical compositions used to practice the method of the present invention should
contain about 0.01 µg to about 100 mg (preferably about 0.1 µg to about 10 mg, more
preferably about 0.1 µg to about 1 mg) of protein or other active ingredient of the present
invention per kg body weight. For compositions of the present invention which are
useful for bone, cartilage, tendon or ligament regeneration, the therapeutic method
includes administering the composition topically, systemically, or locally as an implant
or device. When administered, the therapeutic composition for use in this invention is, of
course, in a pyrogen-free, physiologically acceptable form. Further, the composition may
desirably be encapsulated or injected in a viscous form for delivery to the site of bone,
cartilage or tissue damage. Topical administration may be suitable for wound healing
and tissue repair. Therapeutically useful agents other than a protein or other active
ingredient of the invention which may also optionally be included in the composition as
described above, may alternatively or additionally, be administered simultaneously or
sequentially with the composition in the methods of the invention. Preferably for bone
and/or cartilage formation, the composition would include a matrix capable of delivering
the protein-containing or other active ingredient-containing composition to the site of
bone and/or cartilage damage, providing a structure for the developing bone and cartilage
and optimally capable of being resorbed into the body. Such matrices may be formed of
materials presently in use for other implanted medical applications.

The choice of matrix material is based on biocompatibility, biodegradability,
mechanical properties, cosmetic appearance and interface properties. The particular
application of the compositions will define the appropriate formulation. Potential
matrices for the compositions may be biodegradable and chemically defined calcium
sulfate, tricalcium phosphate, hydroxyapatite, polylactic acid, polyglycolic acid and
polyanhydrides. Other potential materials are biodegradable and biologically
well-defined, such as bone or dermal collagen. Further matrices are comprised of pure
proteins or extracellular matrix components. Other potential matrices are
nonbiodegradable and chemically defined, such as sintered hydroxyapatite, bioglass,
aluminates, or other ceramics. Matrices may be comprised of combinations of any of the
above mentioned types of material, such as polylactic acid and hydroxyapatite or
collagen and tricalcium phosphate. The bioceramics may be altered in composition, such
as in calcium-aluminate-phosphate and processing to alter pore size, particle size, particle
shape, and biodegradability. Presently preferred is a 50:50 (mole weight) copolymer of
lactic acid and glycolic acid in the form of porous particles having diameters ranging
from 150 to 800 microns. In some applications, it will be useful to utilize a sequestering
agent, such as carboxymethyl cellulose or autologous blood clot, to prevent the protein
compositions from disassociating from the matrix.

A preferred family of sequestering agents is cellulosic materials such as
alkylcelluloses (including hydroxyalkylcelluloses), including methylcellulose,
ethylcellulose, hydroxyethylcellulose, hydroxypropylcellulose,
hydroxypropyl-methylcellulose, and carboxymethylcellulose, the most preferred being
cationic salts of carboxymethylcellulose (CMC). Other preferred sequestering agents
include hyaluronic acid, sodium alginate, poly(ethylene glycol), polyoxyethylene oxide,
carboxyvinyl polymer and poly(vinyl alcohol). The amount of sequestering agent useful
herein is 0.5-20 wt %, preferably 1-10 wt % based on total formulation weight, which
represents the amount necessary to prevent desorption of the protein from the polymer
matrix and to provide appropriate handling of the composition, yet not so much that the
progenitor cells are prevented from infiltrating the matrix, thereby providing the protein
the opportunity to assist the osteogenic activity of the progenitor cells. In further
compositions, proteins or other active ingredients of the invention may be combined with
other agents beneficial to the treatment of the bone and/or cartilage defect, wound, or
tissue in question. These agents include various growth factors such as epidermal growth
factor (EGF), platelet derived growth factor (PDGF), transforming growth factors
(TGF-α and TGF-β), and insulin-like growth factor (IGF).

The therapeutic compositions are also presently valuable for veterinary
applications. Particularly domestic animals and thoroughbred horses, in addition to
humans, are desired patients for such treatment with proteins or other active ingredients
of the present invention. The dosage regimen of a protein-containing pharmaceutical
composition to be used in tissue regeneration will be determined by the attending
physician considering various factors which modify the action of the proteins, e.g.,
amount of tissue weight desired to be formed, the site of damage, the condition of the
damaged tissue, the size of a wound, type of damaged tissue (e.g., bone), the patient's
age, sex, and diet, the severity of any infection, time of administration and other clinical
factors. The dosage may vary with the type of matrix used in the reconstitution and with
inclusion of other proteins in the pharmaceutical composition. For example, the addition
of other known growth factors, such as IGF I (insulin like growth factor I), to the final
composition, may also affect the dosage. Progress can be monitored by periodic
assessment of tissue/bone growth and/or repair, for example, X-rays, histomorphometric
determinations and tetracycline labeling.

Polynucleotides of the present invention can also be used for gene therapy. Such
polynucleotides can be introduced either in vivo or ex vivo into cells for expression in a
mammalian subject. Polynucleotides of the invention may also be administered by other
known methods for introduction of nucleic acid into a cell or organism (including,
without limitation, in the form of viral vectors or naked DNA). Cells may also be
cultured ex vivo in the presence of proteins of the present invention in order to proliferate
or to produce a desired effect on or activity in such cells. Treated cells can then be
introduced in vivo for therapeutic purposes.

4.12.3 EFFECTIVE DOSAGE

Pharmaceutical compositions suitable for use in the present invention include
compositions wherein the active ingredients are contained in an effective amount to
achieve their intended purpose. More specifically, a therapeutically effective amount
means an amount effective to prevent development of or to alleviate the existing
symptoms of the subject being treated. Determination of the effective amount is well
within the capability of those skilled in the art, especially in light of the detailed
disclosure provided herein. For any compound used in the method of the invention, the
therapeutically effective dose can be estimated initially from appropriate in vitro assays.
For example, a dose can be formulated in animal models to achieve a
circulating concentration range that includes the IC50 as determined in cell culture (i.e.,
the concentration of the test compound which achieves a half-maximal inhibition of the
protein's biological activity). Such information can be used to more accurately determine
useful doses in humans.
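
As a non-limiting illustration of how an IC50 might be read off cell culture data, the following Python sketch interpolates the concentration giving 50% inhibition between the two measured points that bracket it. The interpolation scheme (linear in the logarithm of concentration), the function name and the example data are assumptions made for illustration; in practice a fitted dose-response curve would typically be used.

```python
import math


def estimate_ic50(concentrations, inhibitions):
    """Interpolate the concentration that gives 50% inhibition.

    concentrations: positive values in ascending order (any consistent unit).
    inhibitions: percent inhibition measured at each concentration, assumed
        to rise with concentration.
    Returns None when no pair of adjacent points brackets 50%.
    """
    points = list(zip(concentrations, inhibitions))
    for (c_lo, i_lo), (c_hi, i_hi) in zip(points, points[1:]):
        if i_lo <= 50.0 <= i_hi and i_hi > i_lo:
            # Position of 50% between the two bracketing measurements.
            frac = (50.0 - i_lo) / (i_hi - i_lo)
            log_c = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_c
    return None


# Hypothetical data: concentrations in nM, inhibition in percent.
ic50 = estimate_ic50([1.0, 100.0], [0.0, 100.0])
print(ic50)
```

With the hypothetical two-point data shown, the estimate falls at the geometric midpoint of the two concentrations, as expected for interpolation on a logarithmic scale.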

A therapeutically effective dose refers to that amount of the compound that results
in amelioration of symptoms or a prolongation of survival in a patient. Toxicity and
therapeutic efficacy of such compounds can be determined by standard pharmaceutical
procedures in cell cultures or experimental animals, e.g., for determining the LD50 (the
dose lethal to 50% of the population) and the ED50 (the dose therapeutically effective in
50% of the population). The dose ratio between toxic and therapeutic effects is the
therapeutic index and it can be expressed as the ratio between LD50 and ED50.
Compounds which exhibit high therapeutic indices are preferred. The data obtained from
these cell culture assays and animal studies can be used in formulating a range of dosage
for use in humans. The dosage of such compounds lies preferably within a range of
circulating concentrations that include the ED50 with little or no toxicity. The dosage
may vary within this range depending upon the dosage form employed and the route of
administration utilized. The exact formulation, route of administration and dosage can be
chosen by the individual physician in view of the patient's condition. See, e.g., Fingl et
al., 1975, in "The Pharmacological Basis of Therapeutics", Ch. 1, p. 1. Dosage amount
and interval may be adjusted individually to provide plasma levels of the active moiety
which are sufficient to maintain the desired effects, or minimal effective concentration
(MEC). The MEC will vary for each compound but can be estimated from in vitro data.
Dosages necessary to achieve the MEC will depend on individual characteristics and
route of administration. However, HPLC assays or bioassays can be used to determine
plasma concentrations.
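
The therapeutic index described above is a simple ratio, LD50 / ED50. The sketch below shows the arithmetic; the function name and the numerical values are hypothetical and are given only to illustrate the calculation.

```python
def therapeutic_index(ld50, ed50):
    """Return the therapeutic index LD50 / ED50.

    Both doses must be expressed in the same unit (e.g., mg/kg).
    """
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("LD50 and ED50 must be positive")
    return ld50 / ed50


# Hypothetical values in mg/kg, for illustration only.
print(therapeutic_index(ld50=500.0, ed50=25.0))  # 20.0
```

A larger value indicates a wider margin between the therapeutically effective dose and the lethal dose, which is why compounds with high therapeutic indices are preferred.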

Dosage intervals can also be determined using the MEC value. Compounds should
be administered using a regimen which maintains plasma levels above the MEC for
10-90% of the time, preferably between 30-90% and most preferably between 50-90%.
In cases of local administration or selective uptake, the effective local concentration of
the drug may not be related to plasma concentration.
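
To illustrate how a regimen might be checked against the time-above-MEC ranges given above, the following sketch computes the fraction of a dosing interval during which measured plasma concentrations exceed the MEC. It assumes the samples are evenly spaced across the interval; the function names and the sample values are hypothetical.

```python
def fraction_above_mec(plasma_concentrations, mec):
    """Fraction of evenly spaced plasma samples that lie above the MEC."""
    if not plasma_concentrations:
        raise ValueError("at least one plasma sample is required")
    above = sum(1 for c in plasma_concentrations if c > mec)
    return above / len(plasma_concentrations)


def within_range(fraction, low, high):
    """True if the time-above-MEC fraction lies within [low, high]."""
    return low <= fraction <= high


# Hypothetical samples (same unit as the MEC) over one dosing interval.
samples = [5.0, 4.0, 3.0, 2.0, 1.0, 0.5, 0.25, 0.1]
frac = fraction_above_mec(samples, mec=1.5)
print(frac, within_range(frac, 0.10, 0.90), within_range(frac, 0.50, 0.90))
```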

An exemplary dosage regimen for polypeptides or other compositions of the
invention will be in the range of about 0.01 µg/kg to 100 mg/kg of body weight daily,
with the preferred dose being about 0.1 µg/kg to 25 mg/kg of patient body weight daily,
varying in adults and children. Dosing may be once daily, or equivalent doses may be
delivered at longer or shorter intervals.
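
As a worked example of the weight-based ranges above, the sketch below converts the per-kilogram limits into a total daily dose range for a given body weight. The limits are taken from the preceding paragraph; the function name and the 70 kg body weight are assumptions for illustration, and the result is arithmetic only, not a dosing recommendation.

```python
# Per-kilogram daily limits from the text, expressed in mg/kg.
EXEMPLARY_RANGE_MG_PER_KG = (0.01e-3, 100.0)  # 0.01 ug/kg to 100 mg/kg
PREFERRED_RANGE_MG_PER_KG = (0.1e-3, 25.0)    # 0.1 ug/kg to 25 mg/kg


def daily_dose_range_mg(body_weight_kg, per_kg_range=EXEMPLARY_RANGE_MG_PER_KG):
    """Return (minimum, maximum) total daily dose in mg for a body weight."""
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    low, high = per_kg_range
    return low * body_weight_kg, high * body_weight_kg


# Hypothetical 70 kg adult.
print(daily_dose_range_mg(70.0))
print(daily_dose_range_mg(70.0, PREFERRED_RANGE_MG_PER_KG))
```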

The amount of composition administered will, of course, be dependent on the
subject being treated, on the subject's age and weight, the severity of the affliction, the
manner of administration and the judgment of the prescribing physician.

4.12.4 PACKAGING 

The compositions may, if desired, be presented in a pack or dispenser device 
which may contain one or more unit dosage forms containing the active ingredient. The 
pack may, for example, comprise metal or plastic foil, such as a blister pack. The pack or
dispenser device may be accompanied by instructions for administration. Compositions 
comprising a compound of the invention formulated in a compatible pharmaceutical 
carrier may also be prepared, placed in an appropriate container, and labeled for 
treatment of an indicated condition. 

4.13 ANTIBODIES 
Also included in the invention are antibodies to proteins, or fragments of proteins
of the invention. The term "antibody" as used herein refers to immunoglobulin molecules
and immunologically active portions of immunoglobulin (Ig) molecules, i.e., molecules
that contain an antigen-binding site that specifically binds (immunoreacts with) an
antigen. Such antibodies include, but are not limited to, polyclonal, monoclonal,
chimeric, single chain, Fab, Fab' and F(ab')2 fragments, and an Fab expression library. In
general, an antibody molecule obtained from humans relates to any of the classes IgG,
IgM, IgA, IgE and IgD, which differ from one another by the nature of the heavy chain
present in the molecule. Certain classes have subclasses as well, such as IgG1, IgG2, and
others. Furthermore, in humans, the light chain may be a kappa chain or a lambda chain.
Reference herein to antibodies includes a reference to all such classes, subclasses and
types of human antibody species.

An isolated protein of the invention, or a portion or fragment thereof, may serve as an
antigen and additionally can be used as an immunogen
to generate antibodies that immunospecifically bind the antigen, using standard
techniques for polyclonal and monoclonal antibody preparation. The full-length protein
can be used or, alternatively, the invention provides antigenic peptide fragments of the
antigen for use as immunogens. An antigenic peptide fragment comprises at least 6
amino acid residues of the amino acid sequence of the full length protein, such as an
amino acid sequence shown in SEQ ID NO: 1-438, and encompasses an epitope thereof
such that an antibody raised against the peptide forms a specific immune complex with
the full length protein or with any fragment that contains the epitope. Preferably, the
antigenic peptide comprises at least 10 amino acid residues, or at least 15 amino acid
residues, or at least 20 amino acid residues, or at least 30 amino acid residues. Preferred
epitopes encompassed by the antigenic peptide are regions of the protein that are located
on its surface; commonly these are hydrophilic regions.

In certain embodiments of the invention, at least one epitope encompassed by the
antigenic peptide is a region of alpha-2-macroglobulin-like protein that is located on the
surface of the protein, e.g., a hydrophilic region. A hydrophobicity analysis of the human
protein sequence will indicate which regions of the protein are particularly
hydrophilic and, therefore, are likely to contain surface residues useful for targeting
antibody production. As a means for targeting antibody production, hydropathy plots
showing regions of hydrophilicity and hydrophobicity may be generated by any method
well known in the art, including, for example, the Kyte Doolittle or the Hopp Woods
methods, either with or without Fourier transformation. See, e.g., Hopp and Woods,
1981, Proc. Natl. Acad. Sci. USA 78: 3824-3828; Kyte and Doolittle 1982, J. Mol. Biol.
157: 105-142, each of which is incorporated herein by reference in its entirety.
Antibodies that are specific for one or more domains within an antigenic protein, or
derivatives, fragments, analogs or homologs thereof, are also provided herein.
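
The hydropathy analysis mentioned above can be sketched as a sliding-window average over the Kyte Doolittle scale. The following minimal example, without Fourier transformation, uses the published Kyte Doolittle hydropathy values; the window length of 7 residues, the function names and the test sequence are assumptions chosen for illustration. The most negative window average marks the most hydrophilic stretch, i.e., a candidate surface region.

```python
# Kyte-Doolittle hydropathy values (positive = hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=7):
    """Mean Kyte-Doolittle score for every window along the sequence."""
    scores = [KYTE_DOOLITTLE[residue] for residue in sequence.upper()]
    if window < 1 or window > len(scores):
        raise ValueError("window must be between 1 and the sequence length")
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


def most_hydrophilic_window(sequence, window=7):
    """Return (start index, peptide, mean score) of the most hydrophilic window."""
    profile = hydropathy_profile(sequence, window)
    start = min(range(len(profile)), key=profile.__getitem__)
    return start, sequence[start:start + window], profile[start]


# Hypothetical test sequence: a lysine-rich stretch flanked by alanines.
print(most_hydrophilic_window("AAAAAAAKKKKKKKAAAAAAA"))
```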

A protein of the invention, or a derivative, fragment, analog, homolog or ortholog
thereof, may be utilized as an immunogen in the generation of antibodies that
immunospecifically bind these protein components.

The term "specific for" indicates that the variable regions of the antibodies of the
invention recognize and bind polypeptides of the invention exclusively (i.e., able to
distinguish the polypeptide of the invention from other similar polypeptides despite
sequence identity, homology, or similarity found in the family of polypeptides), but may
also interact with other proteins (for example, S. aureus protein A or other antibodies in
ELISA techniques) through interactions with sequences outside the variable region of the
antibodies, and in particular, in the constant region of the molecule. Screening assays to
determine binding specificity of an antibody of the invention are well known and
routinely practiced in the art. For a comprehensive discussion of such assays, see Harlow
et al. (Eds), Antibodies: A Laboratory Manual; Cold Spring Harbor Laboratory; Cold
Spring Harbor, NY (1988), Chapter 6. Antibodies that recognize and bind fragments of
the polypeptides of the invention are also contemplated, provided that the antibodies are
first and foremost specific for, as defined above, full-length polypeptides of the
invention. As with antibodies that are specific for full length polypeptides of the
invention, antibodies of the invention that recognize fragments are those which can
distinguish polypeptides from the same family of polypeptides despite inherent sequence
identity, homology, or similarity found in the family of proteins.

Antibodies of the invention are useful for, for example, therapeutic purposes (by
modulating activity of a polypeptide of the invention), diagnostic purposes to detect or
quantitate a polypeptide of the invention, as well as purification of a polypeptide of the
invention. Kits comprising an antibody of the invention for any of the purposes
described herein are also comprehended. In general, a kit of the invention also includes a
control antigen for which the antibody is immunospecific. The invention further provides
a hybridoma that produces an antibody according to the invention. Antibodies of the
invention are useful for detection and/or purification of the polypeptides of the invention.
Monoclonal antibodies binding to the protein of the invention may be useful
diagnostic agents for the immunodetection of the protein. Neutralizing monoclonal
antibodies binding to the protein may also be useful therapeutics for both conditions
associated with the protein and also in the treatment of some forms of cancer where
abnormal expression of the protein is involved. In the case of cancerous cells or
leukemic cells, neutralizing monoclonal antibodies against the protein may be useful in
detecting and preventing the metastatic spread of the cancerous cells, which may be
mediated by the protein.

The labeled antibodies of the present invention can be used for in vitro, in vivo,
and in situ assays to identify cells or tissues in which a fragment of the polypeptide of
interest is expressed. The antibodies may also be used directly in therapies or other
diagnostics. The present invention further provides the above-described antibodies
immobilized on a solid support. Examples of such solid supports include plastics such as
polycarbonate, complex carbohydrates such as agarose and Sepharose®, acrylic resins
such as polyacrylamide, and latex beads. Techniques for coupling antibodies to such
solid supports are well known in the art (Weir, D.M. et al., "Handbook of Experimental
Immunology" 4th Ed., Blackwell Scientific Publications, Oxford, England, Chapter 10
(1986); Jacoby, W.D. et al., Meth. Enzym. 34 Academic Press, N.Y. (1974)). The
immobilized antibodies of the present invention can be used for in vitro, in vivo, and in
situ assays as well as for immuno-affinity purification of the proteins of the present
invention.

Various procedures known within the art may be used for the production of
polyclonal or monoclonal antibodies directed against a protein of the invention, or against
derivatives, fragments, analogs, homologs or orthologs thereof (see, for example,
Antibodies: A Laboratory Manual, Harlow E, and Lane D, 1988, Cold Spring Harbor
Laboratory Press, Cold Spring Harbor, NY, incorporated herein by reference). Some of
these antibodies are discussed below.

4.13.1 POLYCLONAL ANTIBODIES 

For the production of polyclonal antibodies, various suitable host animals (e.g.,
rabbit, goat, mouse or other mammal) may be immunized by one or more injections with
the native protein, a synthetic variant thereof, or a derivative of the foregoing. An
appropriate immunogenic preparation can contain, for example, the naturally occurring
immunogenic protein, a chemically synthesized polypeptide representing the
immunogenic protein, or a recombinantly expressed immunogenic protein. Furthermore,
the protein may be conjugated to a second protein known to be immunogenic in the
mammal being immunized. Examples of such immunogenic proteins include but are not
limited to keyhole limpet hemocyanin, serum albumin, bovine thyroglobulin, and
soybean trypsin inhibitor. The preparation can further include an adjuvant. Various
adjuvants used to increase the immunological response include, but are not limited to,
Freund's (complete and incomplete), mineral gels (e.g., aluminum hydroxide), surface-active
substances (e.g., lysolecithin, pluronic polyols, polyanions, peptides, oil emulsions,
dinitrophenol, etc.), adjuvants usable in humans such as Bacille Calmette-Guerin and
Corynebacterium parvum, or similar immunostimulatory agents. Additional examples of
adjuvants that can be employed include MPL-TDM adjuvant (monophosphoryl lipid A,
synthetic trehalose dicorynomycolate).

The polyclonal antibody molecules directed against the immunogenic protein can
be isolated from the mammal (e.g., from the blood) and further purified by well known
techniques, such as affinity chromatography using protein A or protein G, which provide
primarily the IgG fraction of immune serum. Subsequently, or alternatively, the specific
antigen which is the target of the immunoglobulin sought, or an epitope thereof, may be
immobilized on a column to purify the immune specific antibody by immunoaffinity
chromatography. Purification of immunoglobulins is discussed, for example, by D.
Wilkinson (The Scientist, published by The Scientist, Inc., Philadelphia PA, Vol. 14, No.
8 (April 17, 2000), pp. 25-28).



WO 02/081731 PCT/US02/01222



4.13.2 MONOCLONAL ANTIBODIES 

The term "monoclonal antibody" (MAb) or "monoclonal antibody composition",
as used herein, refers to a population of antibody molecules that contain only one
molecular species of antibody molecule consisting of a unique light chain gene product
and a unique heavy chain gene product. In particular, the complementarity determining
regions (CDRs) of the monoclonal antibody are identical in all the molecules of the
population. MAbs thus contain an antigen-binding site capable of immunoreacting with a
particular epitope of the antigen characterized by a unique binding affinity for it.

Monoclonal antibodies can be prepared using hybridoma methods, such as those
described by Kohler and Milstein, Nature, 256:495 (1975). In a hybridoma method, a
mouse, hamster, or other appropriate host animal, is typically immunized with an
immunizing agent to elicit lymphocytes that produce or are capable of producing
antibodies that will specifically bind to the immunizing agent. Alternatively, the
lymphocytes can be immunized in vitro.

The immunizing agent will typically include the protein antigen, a fragment
thereof or a fusion protein thereof. Generally, either peripheral blood lymphocytes are
used if cells of human origin are desired, or spleen cells or lymph node cells are used if
non-human mammalian sources are desired. The lymphocytes are then fused with an
immortalized cell line using a suitable fusing agent, such as polyethylene glycol, to form
a hybridoma cell (Goding, Monoclonal Antibodies: Principles and Practice, Academic
Press, (1986) pp. 59-103). Immortalized cell lines are usually transformed mammalian
cells, particularly myeloma cells of rodent, bovine and human origin. Usually, rat or
mouse myeloma cell lines are employed. The hybridoma cells can be cultured in a
suitable culture medium that preferably contains one or more substances that inhibit the
growth or survival of the unfused, immortalized cells. For example, if the parental cells
lack the enzyme hypoxanthine guanine phosphoribosyl transferase (HGPRT or HPRT),
the culture medium for the hybridomas typically will include hypoxanthine, aminopterin,
and thymidine ("HAT medium"), which substances prevent the growth of HGPRT-deficient
cells.

Preferred immortalized cell lines are those that fuse efficiently, support stable
high level expression of antibody by the selected antibody-producing cells, and are
sensitive to a medium such as HAT medium. More preferred immortalized cell lines are
murine myeloma lines, which can be obtained, for instance, from the Salk Institute Cell
Distribution Center, San Diego, California and the American Type Culture Collection,
Manassas, Virginia. Human myeloma and mouse-human heteromyeloma cell lines also
have been described for the production of human monoclonal antibodies (Kozbor, J.
Immunol., 133:3001 (1984); Brodeur et al., Monoclonal Antibody Production Techniques
and Applications, Marcel Dekker, Inc., New York, (1987) pp. 51-63).

The culture medium in which the hybridoma cells are cultured can then be
assayed for the presence of monoclonal antibodies directed against the antigen.
Preferably, the binding specificity of monoclonal antibodies produced by the hybridoma
cells is determined by immunoprecipitation or by an in vitro binding assay, such as
radioimmunoassay (RIA) or enzyme-linked immunosorbent assay (ELISA). Such
techniques and assays are known in the art. The binding affinity of the monoclonal
antibody can, for example, be determined by the Scatchard analysis of Munson and
Pollard, Anal. Biochem., 107:220 (1980). Preferably, antibodies having a high degree of
specificity and a high binding affinity for the target antigen are isolated.
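
As an illustration of how a binding affinity can be read from a Scatchard plot, the sketch below fits a straight line to bound/free versus bound ligand. For a single class of binding sites, B/F = (Bmax - B)/Kd, so the slope is -1/Kd and the x-intercept is Bmax. This is a plain least-squares Scatchard fit written for this example, not the procedure of Munson and Pollard itself; the function name and the data are hypothetical.

```python
def scatchard_fit(bound, free):
    """Least-squares Scatchard fit.

    bound, free: equal-length sequences of bound and free ligand
    concentrations (same unit). Returns (Kd, Bmax).
    """
    xs = list(bound)
    ys = [b / f for b, f in zip(bound, free)]
    n = len(xs)
    if n < 2:
        raise ValueError("at least two data points are required")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                      # slope = -1 / Kd
    intercept = mean_y - slope * mean_x    # y-intercept = Bmax / Kd
    kd = -1.0 / slope
    bmax = -intercept / slope              # x-intercept
    return kd, bmax


# Hypothetical data generated from Kd = 2 and Bmax = 10 (arbitrary units).
free = [0.5, 1.0, 2.0, 4.0, 8.0]
bound = [10.0 * f / (2.0 + f) for f in free]
print(scatchard_fit(bound, free))
```

A smaller Kd corresponds to a higher binding affinity for the target antigen.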

After the desired hybridoma cells are identified, the clones can be subcloned by
limiting dilution procedures and grown by standard methods. Suitable culture media for
this purpose include, for example, Dulbecco's Modified Eagle's Medium and RPMI-1640
medium. Alternatively, the hybridoma cells can be grown in vivo as ascites in a
mammal.

The monoclonal antibodies secreted by the subclones can be isolated or purified
from the culture medium or ascites fluid by conventional immunoglobulin purification
procedures such as, for example, protein A-Sepharose, hydroxylapatite chromatography,
gel electrophoresis, dialysis, or affinity chromatography.

The monoclonal antibodies can also be made by recombinant DNA methods, such
as those described in U.S. Patent No. 4,816,567. DNA encoding the monoclonal
antibodies of the invention can be readily isolated and sequenced using conventional
procedures (e.g., by using oligonucleotide probes that are capable of binding specifically
to genes encoding the heavy and light chains of murine antibodies). The hybridoma cells
of the invention serve as a preferred source of such DNA. Once isolated, the DNA can
be placed into expression vectors, which are then transfected into host cells such as
simian COS cells, Chinese hamster ovary (CHO) cells, or myeloma cells that do not
otherwise produce immunoglobulin protein, to obtain the synthesis of monoclonal
antibodies in the recombinant host cells. The DNA also can be modified, for example, by
substituting the coding sequence for human heavy and light chain constant domains in
place of the homologous murine sequences (U.S. Patent No. 4,816,567; Morrison, Nature
368, 812-13 (1994)) or by covalently joining to the immunoglobulin coding sequence all
or part of the coding sequence for a non-immunoglobulin polypeptide. Such a non-immunoglobulin
polypeptide can be substituted for the constant domains of an antibody
of the invention, or can be substituted for the variable domains of one antigen-combining
site of an antibody of the invention to create a chimeric bivalent antibody.

4.13.3 HUMANIZED ANTIBODIES 

The antibodies directed against the protein antigens of the invention can further
comprise humanized antibodies or human antibodies. These antibodies are suitable for
administration to humans without engendering an immune response by the human against
the administered immunoglobulin. Humanized forms of antibodies are chimeric
immunoglobulins, immunoglobulin chains or fragments thereof (such as Fv, Fab, Fab',
F(ab')2 or other antigen-binding subsequences of antibodies) that are principally
comprised of the sequence of a human immunoglobulin, and contain minimal sequence
derived from a non-human immunoglobulin. Humanization can be performed following
the method of Winter and co-workers (Jones et al., Nature, 321:522-525 (1986);
Riechmann et al., Nature, 332:323-327 (1988); Verhoeyen et al., Science, 239:1534-1536
(1988)), by substituting rodent CDRs or CDR sequences for the corresponding sequences
of a human antibody. (See also U.S. Patent No. 5,225,539). In some instances, Fv
framework residues of the human immunoglobulin are replaced by corresponding non-human
residues. Humanized antibodies can also comprise residues that are found neither
in the recipient antibody nor in the imported CDR or framework sequences. In general,
the humanized antibody will comprise substantially all of at least one, and typically two,
variable domains, in which all or substantially all of the CDR regions correspond to those
of a non-human immunoglobulin and all or substantially all of the framework regions are
those of a human immunoglobulin consensus sequence. The humanized antibody
optimally also will comprise at least a portion of an immunoglobulin constant region
(Fc), typically that of a human immunoglobulin (Jones et al., 1986; Riechmann et al.,
1988; and Presta, Curr. Op. Struct. Biol., 2:593-596 (1992)).

4.13.4 HUMAN ANTIBODIES

Fully human antibodies relate to antibody molecules in which essentially the
entire sequences of both the light chain and the heavy chain, including the CDRs, arise
from human genes. Such antibodies are termed "human antibodies", or "fully human
antibodies" herein. Human monoclonal antibodies can be prepared by the trioma
technique; the human B-cell hybridoma technique (see Kozbor, et al., 1983 Immunol
Today 4: 72) and the EBV hybridoma technique to produce human monoclonal
antibodies (see Cole, et al., 1985 In: MONOCLONAL ANTIBODIES AND CANCER THERAPY,
Alan R. Liss, Inc., pp. 77-96). Human monoclonal antibodies may be utilized in the
practice of the present invention and may be produced by using human hybridomas (see
Cote, et al., 1983. Proc Natl Acad Sci USA 80: 2026-2030) or by transforming human
B-cells with Epstein Barr Virus in vitro (see Cole, et al., 1985 In: Monoclonal
Antibodies and Cancer Therapy, Alan R. Liss, Inc., pp. 77-96).

In addition, human antibodies can also be produced using additional techniques,
including phage display libraries (Hoogenboom and Winter, J. Mol. Biol., 227:381
(1991); Marks et al., J. Mol. Biol., 222:581 (1991)). Similarly, human antibodies can be
made by introducing human immunoglobulin loci into transgenic animals, e.g., mice in
which the endogenous immunoglobulin genes have been partially or completely
inactivated. Upon challenge, human antibody production is observed, which closely
resembles that seen in humans in all respects, including gene rearrangement, assembly,
and antibody repertoire. This approach is described, for example, in U.S. Patent Nos.
5,545,807; 5,545,806; 5,569,825; 5,625,126; 5,633,425; 5,661,016, and in Marks et al.
(Bio/Technology 10, 779-783 (1992)); Lonberg et al. (Nature 368, 856-859 (1994));
Morrison (Nature 368, 812-13 (1994)); Fishwild et al. (Nature Biotechnology 14, 845-51
(1996)); Neuberger (Nature Biotechnology 14, 826 (1996)); and Lonberg and Huszar
(Intern. Rev. Immunol. 13, 65-93 (1995)).
Human antibodies may additionally be produced using transgenic nonhuman
animals that are modified so as to produce fully human antibodies rather than the
animal's endogenous antibodies in response to challenge by an antigen. (See PCT
publication WO94/02602). The endogenous genes encoding the heavy and light
immunoglobulin chains in the nonhuman host have been incapacitated, and active loci
encoding human heavy and light chain immunoglobulins are inserted into the host's
genome. The human genes are incorporated, for example, using yeast artificial
chromosomes containing the requisite human DNA segments. An animal which provides
all the desired modifications is then obtained as progeny by crossbreeding intermediate
transgenic animals containing fewer than the full complement of the modifications. The
preferred embodiment of such a nonhuman animal is a mouse, and is termed the
Xenomouse™ as disclosed in PCT publications WO 96/33735 and WO 96/34096. This
animal produces B cells that secrete fully human immunoglobulins. The antibodies can
be obtained directly from the animal after immunization with an immunogen of interest,
as, for example, a preparation of a polyclonal antibody, or alternatively from
immortalized B cells derived from the animal, such as hybridomas producing monoclonal
antibodies. Additionally, the genes encoding the immunoglobulins with human variable
regions can be recovered and expressed to obtain the antibodies directly, or can be further
modified to obtain analogs of antibodies such as, for example, single chain Fv molecules.

An example of a method of producing a nonhuman host, exemplified as a mouse,
lacking expression of an endogenous immunoglobulin heavy chain is disclosed in U.S.
Patent No. 5,939,598. It can be obtained by a method including deleting the J segment
genes from at least one endogenous heavy chain locus in an embryonic stem cell to
prevent rearrangement of the locus and to prevent formation of a transcript of a
rearranged immunoglobulin heavy chain locus, the deletion being effected by a targeting
vector containing a gene encoding a selectable marker; and producing from the
embryonic stem cell a transgenic mouse whose somatic and germ cells contain the gene
encoding the selectable marker.

A method for producing an antibody of interest, such as a human antibody, is
disclosed in U.S. Patent No. 5,916,771. It includes introducing an expression vector that
contains a nucleotide sequence encoding a heavy chain into one mammalian host cell in
culture, introducing an expression vector containing a nucleotide sequence encoding a
light chain into another mammalian host cell, and fusing the two cells to form a hybrid
cell. The hybrid cell expresses an antibody containing the heavy chain and the light
chain.

In a further improvement on this procedure, a method for identifying a clinically
relevant epitope on an immunogen, and a correlative method for selecting an antibody
that binds immunospecifically to the relevant epitope with high affinity, are disclosed in
PCT publication WO 99/53049.

4.13.5 FAB FRAGMENTS AND SINGLE CHAIN ANTIBODIES

According to the invention, techniques can be adapted for the production of
single-chain antibodies specific to an antigenic protein of the invention (see e.g., U.S.
Patent No. 4,946,778). In addition, methods can be adapted for the construction of Fab
expression libraries (see e.g., Huse, et al., 1989 Science 246: 1275-1281) to allow rapid
and effective identification of monoclonal Fab fragments with the desired specificity for a
protein or derivatives, fragments, analogs or homologs thereof. Antibody fragments that
contain the idiotypes to a protein antigen may be produced by techniques known in the
art including, but not limited to: (i) an F(ab')2 fragment produced by pepsin digestion of an
antibody molecule; (ii) an Fab fragment generated by reducing the disulfide bridges of an
F(ab')2 fragment; (iii) an Fab fragment generated by the treatment of the antibody molecule
with papain and a reducing agent; and (iv) Fv fragments.

4.13.6 BISPECIFIC ANTIBODIES 

Bispecific antibodies are monoclonal, preferably human or humanized, antibodies
that have binding specificities for at least two different antigens. In the present case, one
of the binding specificities is for an antigenic protein of the invention. The second
binding target is any other antigen, and advantageously is a cell-surface protein or
receptor or receptor subunit.

Methods for making bispecific antibodies are known in the art. Traditionally, the
recombinant production of bispecific antibodies is based on the co-expression of two
immunoglobulin heavy-chain/light-chain pairs, where the two heavy chains have
different specificities (Milstein and Cuello, Nature, 305:537-539 (1983)). Because of the
random assortment of immunoglobulin heavy and light chains, these hybridomas
(quadromas) produce a potential mixture of ten different antibody molecules, of which
only one has the correct bispecific structure. The purification of the correct molecule is
usually accomplished by affinity chromatography steps. Similar procedures are disclosed
in WO 93/08829, published 13 May 1993, and in Traunecker et al., 1991 EMBO J.
10:3655-3659.
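
The figure of ten possible molecules follows from simple counting, which can be sketched as below. The model is a simplifying assumption made for this example: each antibody consists of two heavy-chain/light-chain half-molecules, each half-molecule pairs one of two heavy chains (labelled HA, HB) with one of two light chains (labelled LA, LB), and the two halves of a molecule are interchangeable. The labels are hypothetical.

```python
from itertools import combinations_with_replacement, product

heavy_chains = ["HA", "HB"]
light_chains = ["LA", "LB"]

# Every possible heavy/light half-molecule: 2 x 2 = 4.
half_molecules = list(product(heavy_chains, light_chains))

# An antibody is an unordered pair of half-molecules (repeats allowed).
antibodies = list(combinations_with_replacement(half_molecules, 2))

# The desired bispecific molecule pairs each heavy chain with its own light chain.
correct = [
    ab for ab in antibodies
    if set(ab) == {("HA", "LA"), ("HB", "LB")}
]

print(len(antibodies))  # 10
print(len(correct))     # 1
```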

Antibody variable domains with the desired binding specificities (antibody-antigen
combining sites) can be fused to immunoglobulin constant domain sequences.
The fusion preferably is with an immunoglobulin heavy-chain constant domain,
comprising at least part of the hinge, CH2, and CH3 regions. It is preferred to have the
first heavy-chain constant region (CH1) containing the site necessary for light-chain
binding present in at least one of the fusions. DNAs encoding the immunoglobulin
heavy-chain fusions and, if desired, the immunoglobulin light chain, are inserted into
separate expression vectors, and are co-transfected into a suitable host organism. For
further details of generating bispecific antibodies see, for example, Suresh et al., Methods
in Enzymology, 121:210 (1986).

According to another approach described in WO 96/27011, the interface between
a pair of antibody molecules can be engineered to maximize the percentage of
heterodimers that are recovered from recombinant cell culture. The preferred interface
comprises at least a part of the CH3 region of an antibody constant domain. In this
method, one or more small amino acid side chains from the interface of the first antibody
molecule are replaced with larger side chains (e.g. tyrosine or tryptophan).
Compensatory "cavities" of identical or similar size to the large side chain(s) are created
on the interface of the second antibody molecule by replacing large amino acid side
chains with smaller ones (e.g. alanine or threonine). This provides a mechanism for
increasing the yield of the heterodimer over other unwanted end-products such as
homodimers.

Bispecific antibodies can be prepared as full length antibodies or antibody
fragments (e.g. F(ab')2 bispecific antibodies). Techniques for generating bispecific
antibodies from antibody fragments have been described in the literature. For example,
bispecific antibodies can be prepared using chemical linkage. Brennan et al., Science
229:81 (1985) describe a procedure wherein intact antibodies are proteolytically cleaved
to generate F(ab')2 fragments. These fragments are reduced in the presence of the dithiol
complexing agent sodium arsenite to stabilize vicinal dithiols and prevent intermolecular
disulfide formation. The Fab' fragments generated are then converted to
thionitrobenzoate (TNB) derivatives. One of the Fab'-TNB derivatives is then
reconverted to the Fab'-thiol by reduction with mercaptoethylamine and is mixed with an
equimolar amount of the other Fab'-TNB derivative to form the bispecific antibody. The
bispecific antibodies produced can be used as agents for the selective immobilization of
enzymes.

Additionally, Fab' fragments can be directly recovered from E. coli and
chemically coupled to form bispecific antibodies. Shalaby et al., J. Exp. Med.
175:217-225 (1992) describe the production of a fully humanized bispecific antibody
F(ab')2 molecule. Each Fab' fragment was separately secreted from E. coli and subjected
to directed chemical coupling in vitro to form the bispecific antibody. The bispecific
antibody thus formed was able to bind to cells overexpressing the ErbB2 receptor and
normal human T cells, as well as trigger the lytic activity of human cytotoxic
lymphocytes against human breast tumor targets.

Various techniques for making and isolating bispecific antibody fragments
directly from recombinant cell culture have also been described. For example, bispecific
antibodies have been produced using leucine zippers. Kostelny et al., J. Immunol.
148(5):1547-1553 (1992). The leucine zipper peptides from the Fos and Jun proteins
were linked to the Fab' portions of two different antibodies by gene fusion. The antibody
homodimers were reduced at the hinge region to form monomers and then re-oxidized to
form the antibody heterodimers. This method can also be utilized for the production of
antibody homodimers. The "diabody" technology described by Holliger et al., Proc.
Natl. Acad. Sci. USA 90:6444-6448 (1993) has provided an alternative mechanism for
making bispecific antibody fragments. The fragments comprise a heavy-chain variable
domain (VH) connected to a light-chain variable domain (VL) by a linker which is too
short to allow pairing between the two domains on the same chain. Accordingly, the VH
and VL domains of one fragment are forced to pair with the complementary VL and VH
domains of another fragment, thereby forming two antigen-binding sites. Another
strategy for making bispecific antibody fragments by the use of single-chain Fv (sFv)
dimers has also been reported. See, Gruber et al., J. Immunol. 152:5368 (1994).
Antibodies with more than two valencies are contemplated. For example,
trispecific antibodies can be prepared. Tutt et al., J. Immunol. 147:60 (1991).

Exemplary bispecific antibodies can bind to two different epitopes, at least one of 
which originates in the protein antigen of the invention. Alternatively, an anti-antigenic 
arm of an immunoglobulin molecule can be combined with an arm which binds to a 
triggering molecule on a leukocyte such as a T-cell receptor molecule (e.g. CD2, CD3,
CD28, or B7), or Fc receptors for IgG (FcγR), such as FcγRI (CD64), FcγRII (CD32) and
FcγRIII (CD16) so as to focus cellular defense mechanisms to the cell expressing the
particular antigen. Bispecific antibodies can also be used to direct cytotoxic agents to
cells which express a particular antigen. These antibodies possess an antigen-binding
arm and an arm which binds a cytotoxic agent or a radionuclide chelator, such as
EOTUBE, DTPA, DOTA, or TETA. Another bispecific antibody of interest binds the
protein antigen described herein and further binds tissue factor (TF). 

4.13.7 HETEROCONJUGATE ANTIBODIES 

Heteroconjugate antibodies are also within the scope of the present invention.
Heteroconjugate antibodies are composed of two covalently joined antibodies. Such
antibodies have, for example, been proposed to target immune system cells to unwanted
cells (U.S. Patent No. 4,676,980), and for treatment of HIV infection (WO 91/00360;
WO 92/200373; EP 03089). It is contemplated that the antibodies can be prepared in
vitro using known methods in synthetic protein chemistry, including those involving
crosslinking agents. For example, immunotoxins can be constructed using a disulfide
exchange reaction or by forming a thioether bond. Examples of suitable reagents for this 
purpose include iminothiolate and methyl-4-mercaptobutyrimidate and those disclosed, 
for example, in U.S. Patent No. 4,676,980. 

4.13.8 EFFECTOR FUNCTION ENGINEERING

It can be desirable to modify the antibody of the invention with respect to effector
function, so as to enhance, e.g., the effectiveness of the antibody in treating cancer. For
example, cysteine residue(s) can be introduced into the Fc region, thereby allowing
interchain disulfide bond formation in this region. The homodimeric antibody thus
generated can have improved internalization capability and/or increased
complement-mediated cell killing and antibody-dependent cellular cytotoxicity (ADCC).
See Caron et al., J. Exp. Med., 176: 1191-1195 (1992) and Shopes, J. Immunol., 148:
2918-2922 (1992). Homodimeric antibodies with enhanced anti-tumor activity can also be
prepared using heterobifunctional cross-linkers as described in Wolff et al., Cancer
Research, 53: 2560-2565 (1993). Alternatively, an antibody can be engineered that has
dual Fc regions and can thereby have enhanced complement lysis and ADCC capabilities.
See Stevenson et al., Anti-Cancer Drug Design, 3: 219-230 (1989).

4.13.9 IMMUNOCONJUGATES

The invention also pertains to immunoconjugates comprising an antibody
conjugated to a cytotoxic agent such as a chemotherapeutic agent, toxin (e.g., an
enzymatically active toxin of bacterial, fungal, plant, or animal origin, or fragments 
thereof), or a radioactive isotope (i.e., a radioconjugate). 

Chemotherapeutic agents useful in the generation of such immunoconjugates have
been described above. Enzymatically active toxins and fragments thereof that can be
used include diphtheria A chain, nonbinding active fragments of diphtheria toxin,
exotoxin A chain (from Pseudomonas aeruginosa), ricin A chain, abrin A chain,
modeccin A chain, alpha-sarcin, Aleurites fordii proteins, dianthin proteins, Phytolacca
americana proteins (PAPI, PAPII, and PAP-S), Momordica charantia inhibitor, curcin,
crotin, Saponaria officinalis inhibitor, gelonin, mitogellin, restrictocin, phenomycin,
enomycin, and the trichothecenes. A variety of radionuclides are available for the
production of radioconjugated antibodies. Examples include 212Bi, 131I, 131In, 90Y, and
186Re.

Conjugates of the antibody and cytotoxic agent are made using a variety of
bifunctional protein-coupling agents such as N-succinimidyl-3-(2-pyridyldithiol)
propionate (SPDP), iminothiolane (IT), bifunctional derivatives of imidoesters (such as
dimethyl adipimidate HCl), active esters (such as disuccinimidyl suberate), aldehydes
(such as glutaraldehyde), bis-azido compounds (such as bis (p-azidobenzoyl)
hexanediamine), bis-diazonium derivatives (such as bis-(p-diazoniumbenzoyl)-
ethylenediamine), diisocyanates (such as toluene 2,6-diisocyanate), and bis-active
fluorine compounds (such as 1,5-difluoro-2,4-dinitrobenzene). For example, a ricin
immunotoxin can be prepared as described in Vitetta et al., Science, 238: 1098 (1987).
Carbon-14-labeled 1-isothiocyanatobenzyl-3-methyldiethylene triaminepentaacetic acid
(MX-DTPA) is an exemplary chelating agent for conjugation of radionuclide to the
antibody. See WO 94/11026.

In another embodiment, the antibody can be conjugated to a "receptor" (such as
streptavidin) for utilization in tumor pretargeting wherein the antibody-receptor conjugate
is administered to the patient, followed by removal of unbound conjugate from the
circulation using a clearing agent and then administration of a "ligand" (e.g., avidin) that
is in turn conjugated to a cytotoxic agent.

4.14 COMPUTER READABLE SEQUENCES

In one application of this embodiment, a nucleotide sequence of the present
invention can be recorded on computer readable media. As used herein, "computer
readable media" refers to any medium which can be read and accessed directly by a
computer. Such media include, but are not limited to: magnetic storage media, such as
floppy discs, hard disc storage medium, and magnetic tape; optical storage media such as
CD-ROM; electrical storage media such as RAM and ROM; and hybrids of these
categories such as magnetic/optical storage media. A skilled artisan can readily
appreciate how any of the presently known computer readable mediums can be used to
create a manufacture comprising computer readable medium having recorded thereon a
nucleotide sequence of the present invention. As used herein, "recorded" refers to a
process for storing information on computer readable medium. A skilled artisan can
readily adopt any of the presently known methods for recording information on computer
readable medium to generate manufactures comprising the nucleotide sequence
information of the present invention.

A variety of data storage structures are available to a skilled artisan for creating a
computer readable medium having recorded thereon a nucleotide sequence of the present
invention. The choice of the data storage structure will generally be based on the means
chosen to access the stored information. In addition, a variety of data processor programs
and formats can be used to store the nucleotide sequence information of the present
invention on computer readable medium. The sequence information can be represented
in a word processing text file, formatted in commercially-available software such as
WordPerfect and Microsoft Word, or represented in the form of an ASCII file, stored in a
database application, such as DB2, Sybase, Oracle, or the like. A skilled artisan can
readily adapt any number of data processor structuring formats (e.g. text file or database)
in order to obtain computer readable medium having recorded thereon the nucleotide
sequence information of the present invention.
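
By way of illustration only, recording a nucleotide sequence as an ASCII file might be sketched as follows. The FASTA-style layout, the function names and the example sequence are assumptions made for this sketch; the specification itself does not prescribe any particular file format.

```python
import os
import tempfile


def record_sequence(path, seq_id, sequence, width=60):
    """Record one nucleotide sequence as a FASTA-style ASCII text file."""
    with open(path, "w", encoding="ascii") as handle:
        handle.write(">" + seq_id + "\n")
        # Wrap the sequence at a fixed width so the file stays readable.
        for start in range(0, len(sequence), width):
            handle.write(sequence[start:start + width] + "\n")


def read_sequence(path):
    """Read back the identifier and sequence written by record_sequence."""
    with open(path, "r", encoding="ascii") as handle:
        seq_id = handle.readline().strip()[1:]  # drop the leading ">"
        sequence = "".join(line.strip() for line in handle)
    return seq_id, sequence


def roundtrip(seq_id, sequence):
    """Record a sequence in a temporary file, read it back, and clean up."""
    with tempfile.TemporaryDirectory() as folder:
        path = os.path.join(folder, seq_id + ".txt")
        record_sequence(path, seq_id, sequence)
        return read_sequence(path)
```

The same record could equally be stored as a row in a database table; the text file is used here only because it is the simplest of the storage structures mentioned above.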

By providing any of the nucleotide sequences SEQ ID NOs: 1 - 438 or a
representative fragment thereof; or a nucleotide sequence at least 95% identical to any of
the nucleotide sequences of SEQ ID NOs: 1 - 438 in computer readable form, a skilled
artisan can routinely access the sequence information for a variety of purposes.
Computer software is publicly available which allows a skilled artisan to access sequence
information provided in a computer readable medium. The examples which follow
demonstrate how software which implements the BLAST (Altschul et al., J. Mol. Biol.
215:403-410 (1990)) and BLAZE (Brutlag et al., Comp. Chem. 17:203-207 (1993))
search algorithms on a Sybase system is used to identify open reading frames (ORFs)
within a nucleic acid sequence. Such ORFs may be protein encoding fragments and may
be useful in producing commercially important proteins such as enzymes used in
fermentation reactions and in the production of commercially useful metabolites.
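
As a simplified illustration of ORF identification, the following sketch scans all six reading frames for an ATG codon followed in frame by a stop codon. It is a naive scan written for this example, not the BLAST or BLAZE procedure referred to above, and the function names and the `min_codons` threshold are assumptions.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_orfs(seq, min_codons=10):
    """Return (strand, frame, start, end) for each ATG...stop ORF.

    Coordinates are 0-based on the scanned strand; end is exclusive and
    includes the stop codon. min_codons counts the ATG but not the stop.
    """
    seq = seq.upper()
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # first in-frame start codon opens the ORF
                elif start is not None and codon in STOP_CODONS:
                    if (i - start) // 3 >= min_codons:
                        orfs.append((strand, frame, start, i + 3))
                    start = None  # look for the next ORF in this frame
    return orfs
```

In practice a candidate ORF found this way would still be compared against known sequences, for example with a homology search, before being treated as protein encoding.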

As used herein, "a computer-based system" refers to the hardware means,
software means, and data storage means used to analyze the nucleotide sequence
information of the present invention. The minimum hardware means of the
computer-based systems of the present invention comprises a central processing unit
(CPU), input means, output means, and data storage means. A skilled artisan can readily
appreciate that any one of the currently available computer-based systems is suitable for
use in the present invention. As stated above, the computer-based systems of the present
invention comprise a data storage means having stored therein a nucleotide sequence of
the present invention and the necessary hardware means and software means for
supporting and implementing a search means. As used herein, "data storage means"
refers to memory which can store nucleotide sequence information of the present
invention, or a memory access means which can access manufactures having recorded
thereon the nucleotide sequence information of the present invention.

As used herein, "search means" refers to one or more programs which are
implemented on the computer-based system to compare a target sequence or target
structural motif with the sequence information stored within the data storage means.
Search means are used to identify fragments or regions of a known sequence which
match a particular target sequence or target motif. A variety of known algorithms are
disclosed publicly, and a variety of commercially available software for conducting search
means is available and can be used in the computer-based systems of the present invention.
Examples of such software include, but are not limited to, Smith-Waterman, MacPattern
(EMBL), BLASTN and BLASTA (NPOLYPEPTIDEIA). A skilled artisan can readily
recognize that any one of the available algorithms or implementing software packages for
conducting homology searches can be adapted for use in the present computer-based
systems. As used herein, a "target sequence" can be any nucleic acid or amino acid
sequence of six or more nucleotides or two or more amino acids. A skilled artisan can
readily recognize that the longer a target sequence is, the less likely a target sequence will
be present as a random occurrence in the database. The most preferred sequence length
of a target sequence is from about 10 to 300 amino acids, more preferably from about 30
to 100 nucleotide residues. However, it is well recognized that searches for
commercially important fragments, such as sequence fragments involved in gene
expression and protein processing, may be of shorter length.
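
The relationship between target length and chance matches can be illustrated with a short sketch. The exact-match search below is a stand-in chosen for this example and is far simpler than the homology search programs named above; the estimate of random occurrences assumes independent, equally likely residues, which real sequences only approximate. All names are hypothetical.

```python
def find_matches(target, database):
    """Return (name, position) for every exact occurrence of target.

    database maps a sequence name to its sequence; overlapping matches
    are reported, and positions are 0-based.
    """
    hits = []
    for name, seq in database.items():
        pos = seq.find(target)
        while pos != -1:
            hits.append((name, pos))
            pos = seq.find(target, pos + 1)
    return hits


def expected_random_hits(target_length, database_length, alphabet_size=4):
    """Expected number of chance exact matches under a uniform random model."""
    windows = max(database_length - target_length + 1, 0)
    return windows * alphabet_size ** (-target_length)
```

Under this model a six-nucleotide target is expected to occur about once by chance in roughly 4,100 residues, whereas a 30-nucleotide target is essentially never expected by chance in a database of any realistic size.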

As used herein, "a target structural motif," or "target motif," refers to any
rationally selected sequence or combination of sequences in which the sequence(s) are
chosen based on a three-dimensional configuration which is formed upon the folding of
the target motif. There are a variety of target motifs known in the art. Protein target
motifs include, but are not limited to, enzyme active sites and signal sequences. Nucleic
acid target motifs include, but are not limited to, promoter sequences, hairpin structures
and inducible expression elements (protein binding sequences).

4.15 TRIPLE HELIX FORMATION 

In addition, the fragments of the present invention, as broadly described, can be
used to control gene expression through triple helix formation or antisense DNA or RNA,
both of which methods are based on the binding of a polynucleotide sequence to DNA or
RNA. Polynucleotides suitable for use in these methods are preferably 20 to 40 bases in
length and are designed to be complementary to a region of the gene involved in
transcription (triple helix - see Lee et al., Nucl. Acids Res. 6:3073 (1979); Cooney et al.,
Science 241:456 (1988); and Dervan et al., Science 251:1360 (1991)) or to the mRNA
itself (antisense - Okano, J. Neurochem. 56:560 (1991); Oligodeoxynucleotides as
Antisense Inhibitors of Gene Expression, CRC Press, Boca Raton, FL (1988)). Triple
helix-formation optimally results in a shut-off of RNA transcription from DNA, while
antisense RNA hybridization blocks translation of an mRNA molecule into polypeptide.
Both techniques have been demonstrated to be effective in model systems. Information
contained in the sequences of the present invention is necessary for the design of an
antisense or triple helix oligonucleotide.
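
The sequence-based step of antisense design can be sketched as follows: a window of 20 to 40 bases is taken from the mRNA and its reverse complement is returned as a DNA oligonucleotide. The function name and the example mRNA are hypothetical, and the sketch ignores practical considerations such as target accessibility and secondary structure.

```python
# Maps each mRNA base to the complementary DNA base.
_MRNA_TO_DNA_COMPLEMENT = str.maketrans("ACGU", "TGCA")


def antisense_oligo(mrna, start, length=20):
    """Return a DNA oligo (5'->3') complementary to mrna[start:start+length]."""
    if not 20 <= length <= 40:
        raise ValueError("length should be 20 to 40 bases")
    # Accept sequences written with T as well as U.
    region = mrna.upper().replace("T", "U")[start:start + length]
    if len(region) < length:
        raise ValueError("window runs past the end of the mRNA")
    # Complement each base, then reverse so the oligo reads 5'->3'.
    return region.translate(_MRNA_TO_DNA_COMPLEMENT)[::-1]
```

For the hypothetical mRNA AUGGCCAUUGUAAUGGGCCGCUGAAAGG, the 20-base window starting at the first base yields the oligo CGGCCCATTACAATGGCCAT.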

4.16 DIAGNOSTIC ASSAYS AND KITS

The present invention further provides methods to identify the presence or 
expression of one of the ORFs of the present invention, or homolog thereof, in a test 
sample, using a nucleic acid probe or antibodies of the present invention, optionally 
conjugated or otherwise associated with a suitable label. 

In general, methods for detecting a polynucleotide of the invention can comprise
contacting a sample with a compound that binds to and forms a complex with the
polynucleotide for a period sufficient to form the complex, and detecting the complex, so
that if a complex is detected, a polynucleotide of the invention is detected in the sample.
Such methods can also comprise contacting a sample under stringent hybridization
conditions with nucleic acid primers that anneal to a polynucleotide of the invention
under such conditions, and amplifying annealed polynucleotides, so that if a
polynucleotide is amplified, a polynucleotide of the invention is detected in the sample.

In general, methods for detecting a polypeptide of the invention can comprise 
contacting a sample with a compound that binds to and forms a complex with the 
polypeptide for a period sufficient to form the complex, and detecting the complex, so
that if a complex is detected, a polypeptide of the invention is detected in the sample. 

In detail, such methods comprise incubating a test sample with one or more of the 
antibodies or one or more of the nucleic acid probes of the present invention and assaying 
for binding of the nucleic acid probes or antibodies to components within the test sample. 

Conditions for incubating a nucleic acid probe or antibody with a test sample
vary. Incubation conditions depend on the format employed in the assay, the detection
methods employed, and the type and nature of the nucleic acid probe or antibody used in
the assay. One skilled in the art will recognize that any one of the commonly available
hybridization, amplification or immunological assay formats can readily be adapted to
employ the nucleic acid probes or antibodies of the present invention. Examples of such
assays can be found in Chard, T., An Introduction to Radioimmunoassay and Related
Techniques, Elsevier Science Publishers, Amsterdam, The Netherlands (1986); Bullock,
G.R. et al., Techniques in Immunocytochemistry, Academic Press, Orlando, FL Vol. 1
(1982), Vol. 2 (1983), Vol. 3 (1985); Tijssen, P., Practice and Theory of Immunoassays:
Laboratory Techniques in Biochemistry and Molecular Biology, Elsevier Science
Publishers, Amsterdam, The Netherlands (1985). The test samples of the present
invention include cells, protein or membrane extracts of cells, or biological fluids such as
sputum, blood, serum, plasma, or urine. The test sample used in the above-described
method will vary based on the assay format, nature of the detection method and the
tissues, cells or extracts used as the sample to be assayed. Methods for preparing protein
extracts or membrane extracts of cells are well known in the art and can readily be
adapted in order to obtain a sample which is compatible with the system utilized.

In another embodiment of the present invention, kits are provided which contain
the necessary reagents to carry out the assays of the present invention. Specifically, the
invention provides a compartment kit to receive, in close confinement, one or more
containers, which comprises: (a) a first container comprising one of the probes or
antibodies of the present invention; and (b) one or more other containers comprising one
or more of the following: wash reagents and reagents capable of detecting presence of a
bound probe or antibody.

In detail, a compartment kit includes any kit in which reagents are contained in 
separate containers. Such containers include small glass containers, plastic containers or 
strips of plastic or paper. Such containers allow one to efficiently transfer reagents from
one compartment to another compartment such that the samples and reagents are not 
cross-contaminated, and the agents or solutions of each container can be added in a 
quantitative fashion from one compartment to another. Such containers will include a 
container which will accept the test sample, a container which contains the antibodies 
used in the assay, containers which contain wash reagents (such as phosphate buffered 
saline, Tris-buffers, etc.), and containers which contain the reagents used to detect the 
bound antibody or probe. Types of detection reagents include labeled nucleic acid probes, 
labeled secondary antibodies, or in the alternative, if the primary antibody is labeled, the 
enzymatic, or antibody binding reagents which are capable of reacting with the labeled 
antibody. One skilled in the art will readily recognize that the disclosed probes and 
antibodies of the present invention can be readily incorporated into one of the established
kit formats which are well known in the art. 

4.17 MEDICAL IMAGING 

The novel polypeptides and binding partners of the invention are useful in 
medical imaging of sites expressing the molecules of the invention (e.g., where the 
polypeptide of the invention is involved in the immune response, for imaging sites of 
inflammation or infection). See, e.g., Kunkel et al., U.S. Pat. No. 5,413,778. Such
methods involve chemical attachment of a labeling or imaging agent, administration of 
the labeled polypeptide to a subject in a pharmaceutically acceptable carrier, and imaging 
the labeled polypeptide in vivo at the target site. 

4.18 SCREENING ASSAYS 

Using the isolated proteins and polynucleotides of the invention, the present 
invention further provides methods of obtaining and identifying agents which bind to a
polypeptide encoded by an ORF corresponding to any of the nucleotide sequences set
forth in SEQ ID NOs: 1 - 438, or bind to a specific domain of the polypeptide encoded by 
the nucleic acid. In detail, said method comprises the steps of: 

(a) contacting an agent with an isolated protein encoded by an ORF of the
present invention, or nucleic acid of the invention; and

(b) determining whether the agent binds to said protein or said nucleic acid.

In general, therefore, such methods for identifying compounds that bind to a
polynucleotide of the invention can comprise contacting a compound with a
polynucleotide of the invention for a time sufficient to form a polynucleotide/compound
complex, and detecting the complex, so that if a polynucleotide/compound complex is
detected, a compound that binds to a polynucleotide of the invention is identified.

Likewise, in general, therefore, such methods for identifying compounds that bind
to a polypeptide of the invention can comprise contacting a compound with a polypeptide
of the invention for a time sufficient to form a polypeptide/compound complex, and
detecting the complex, so that if a polypeptide/compound complex is detected, a
compound that binds to a polypeptide of the invention is identified.

Methods for identifying compounds that bind to a polypeptide of the invention
can also comprise contacting a compound with a polypeptide of the invention in a cell for
a time sufficient to form a polypeptide/compound complex, wherein the complex drives
expression of a reporter gene sequence in the cell, and detecting the complex by
detecting reporter gene sequence expression, so that if a polypeptide/compound complex
is detected, a compound that binds a polypeptide of the invention is identified.

Compounds identified via such methods can include compounds which modulate
the activity of a polypeptide of the invention (that is, increase or decrease its activity,
relative to activity observed in the absence of the compound). Alternatively, compounds
identified via such methods can include compounds which modulate the expression of a
polynucleotide of the invention (that is, increase or decrease expression relative to
expression levels observed in the absence of the compound). Compounds, such as
compounds identified via the methods of the invention, can be tested using standard
assays well known to those of skill in the art for their ability to modulate
activity/expression.

The agents screened in the above assay can be, but are not limited to, peptides,
carbohydrates, vitamin derivatives, or other pharmaceutical agents. The agents can be
selected and screened at random or rationally selected or designed using protein modeling
techniques.

For random screening, agents such as peptides, carbohydrates, pharmaceutical
agents and the like are selected at random and are assayed for their ability to bind to the
protein encoded by the ORF of the present invention. Alternatively, agents may be
rationally selected or designed. As used herein, an agent is said to be "rationally selected
or designed" when the agent is chosen based on the configuration of the particular
protein. For example, one skilled in the art can readily adapt currently available
procedures to generate peptides, pharmaceutical agents and the like, capable of binding to
a specific peptide sequence, in order to generate rationally designed antipeptide peptides
(for example, see Hurby et al., "Application of Synthetic Peptides: Antisense Peptides," In
Synthetic Peptides, A User's Guide, W.H. Freeman, NY (1992), pp. 289-307, and
Kaspczak et al., Biochemistry 28:9230-8 (1989)), or pharmaceutical agents, or the like.

In addition to the foregoing, one class of agents of the present invention, as
broadly described, can be used to control gene expression through binding to one of the
ORFs or EMFs of the present invention. As described above, such agents can be
randomly screened or rationally designed/selected. Targeting the ORF or EMF allows a
skilled artisan to design sequence specific or element specific agents, modulating the
expression of either a single ORF or multiple ORFs which rely on the same EMF for
expression control. One class of DNA binding agents are agents which contain base
residues which hybridize or form a triple helix by binding to DNA or RNA.
Such agents can be based on the classic phosphodiester, ribonucleic acid backbone, or
can be a variety of sulfhydryl or polymeric derivatives which have base attachment
capacity.

Agents suitable for use in these methods preferably contain 20 to 40 bases and are
designed to be complementary to a region of the gene involved in transcription (triple
helix - see Lee et al., Nucl. Acids Res. 6:3073 (1979); Cooney et al., Science 241:456
(1988); and Dervan et al., Science 251:1360 (1991)) or to the mRNA itself (antisense -
Okano, J. Neurochem. 56:560 (1991); Oligodeoxynucleotides as Antisense Inhibitors of
Gene Expression, CRC Press, Boca Raton, FL (1988)). Triple helix-formation optimally
results in a shut-off of RNA transcription from DNA, while antisense RNA hybridization
blocks translation of an mRNA molecule into polypeptide. Both techniques have been
demonstrated to be effective in model systems. Information contained in the sequences
of the present invention is necessary for the design of an antisense or triple helix
oligonucleotide and other DNA binding agents.

Agents which bind to a protein encoded by one of the ORFs of the present 
invention can be used as a diagnostic agent. Agents which bind to a protein encoded by 
one of the ORFs of the present invention can be formulated using known techniques to 
generate a pharmaceutical composition.

4.19 USE OF NUCLEIC ACIDS AS PROBES 

Another aspect of the subject invention is to provide for polypeptide-specific
nucleic acid hybridization probes capable of hybridizing with naturally occurring
nucleotide sequences. The hybridization probes of the subject invention may be derived
from any of the nucleotide sequences SEQ ID NOs: 1 - 438. Because the corresponding
gene is only expressed in a limited number of tissues, a hybridization probe derived from
any of the nucleotide sequences SEQ ID NOs: 1 - 438 can be used as an indicator of
the presence of RNA of a cell type of such a tissue in a sample.

Any suitable hybridization technique can be employed, such as, for example, in
situ hybridization. PCR as described in US Patents Nos. 4,683,195 and 4,965,188
provides additional uses for oligonucleotides based upon the nucleotide sequences. Such
probes used in PCR may be of recombinant origin, may be chemically synthesized, or a
mixture of both. The probe will comprise a discrete nucleotide sequence for the detection
of identical sequences or a degenerate pool of possible sequences for identification of
closely related genomic sequences.
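
A degenerate pool can be illustrated by expanding a probe written with the standard IUPAC nucleotide ambiguity codes into the discrete sequences it represents. The function name and example probes below are hypothetical and are given only as a sketch.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_degenerate(probe):
    """List every discrete sequence represented by a degenerate probe."""
    choices = (IUPAC[base] for base in probe.upper())
    return ["".join(combo) for combo in product(*choices)]
```

The pool size is the product of the number of alternatives at each position, so a probe such as GCNAAY (four choices at N, two at Y) represents eight discrete sequences.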

Other means for producing specific hybridization probes for nucleic acids include
the cloning of nucleic acid sequences into vectors for the production of mRNA probes.
Such vectors are known in the art and are commercially available and may be used to
synthesize RNA probes in vitro by means of the addition of the appropriate RNA
polymerase such as T7 or SP6 RNA polymerase and the appropriate radioactively labeled
nucleotides. The nucleotide sequences may be used to construct hybridization probes for
mapping their respective genomic sequences. The nucleotide sequence provided herein
may be mapped to a chromosome or specific regions of a chromosome using well known
genetic and/or chromosomal mapping techniques. These techniques include in situ
hybridization, linkage analysis against known chromosomal markers, hybridization
screening with libraries or flow-sorted chromosomal preparations specific to known
chromosomes, and the like. The technique of fluorescent in situ hybridization of
chromosome spreads has been described, among other places, in Verma et al. (1988)
Human Chromosomes: A Manual of Basic Techniques, Pergamon Press, New York NY.

Fluorescent in situ hybridization of chromosomal preparations and other physical
chromosome mapping techniques may be correlated with additional genetic map data.
Examples of genetic map data can be found in the 1994 Genome Issue of Science
(265:1981f). Correlation between the location of a nucleic acid on a physical
chromosomal map and a specific disease (or predisposition to a specific disease) may
help delimit the region of DNA associated with that genetic disease. The nucleotide
sequences of the subject invention may be used to detect differences in gene sequences
between normal, carrier or affected individuals.

4.20 PREPARATION OF SUPPORT BOUND OLIGONUCLEOTIDES 

Oligonucleotides, i.e., small nucleic acid segments, may be readily prepared by, for
example, directly synthesizing the oligonucleotide by chemical means, as is commonly
practiced using an automated oligonucleotide synthesizer.

Support bound oligonucleotides may be prepared by any of the methods known to
those of skill in the art using any suitable support such as glass, polystyrene or Teflon. One
strategy is to precisely spot oligonucleotides synthesized by standard synthesizers.
Immobilization can be achieved using passive adsorption (Inouye & Hondo, (1990) J. Clin.
Microbiol. 28(6) 1469-72); using UV light (Nagata et al., 1985; Dahlen et al., 1987;
Morrissey & Collins, (1989) Mol. Cell Probes 3(2) 189-207) or by covalent binding of base
modified DNA (Keller et al., 1988; 1989); all references being specifically incorporated
herein.

Another strategy that may be employed is the use of the strong biotin-streptavidin
interaction as a linker. For example, Broude et al. (1994) Proc. Natl. Acad. Sci. USA 91(8)
3072-6, describe the use of biotinylated probes, although these are duplex probes, that are
immobilized on streptavidin-coated magnetic beads. Streptavidin-coated beads may be
purchased from Dynal, Oslo. Of course, this same linking chemistry is applicable to coating
any surface with streptavidin. Biotinylated probes may be purchased from various sources,
such as, e.g., Operon Technologies (Alameda, CA).

Nunc Laboratories (Naperville, IL) is also selling suitable material that could be
used. Nunc Laboratories have developed a method by which DNA can be covalently bound
to the microwell surface termed Covalink NH. Covalink NH is a polystyrene surface
grafted with secondary amino groups (>NH) that serve as bridge-heads for further covalent
coupling. Covalink Modules may be purchased from Nunc Laboratories. DNA molecules
may be bound to Covalink exclusively at the 5'-end by a phosphoramidate bond, allowing
immobilization of more than 1 pmol of DNA (Rasmussen et al., (1991) Anal. Biochem.
198(1) 138-42).

The use of Covalink NH strips for covalent binding of DNA molecules at the 5'-end
has been described (Rasmussen et al., (1991)). In this technology, a phosphoramidate bond
is employed (Chu et al., (1983) Nucleic Acids Res. 11(8) 6513-29). This is beneficial as
immobilization using only a single covalent bond is preferred. The phosphoramidate bond
joins the DNA to the Covalink NH secondary amino groups that are positioned at the end
of spacer arms covalently grafted onto the polystyrene surface through a 2 nm long spacer
arm. To link an oligonucleotide to Covalink NH via a phosphoramidate bond, the
oligonucleotide terminus must have a 5'-end phosphate group. It is, perhaps, even possible
for biotin to be covalently bound to Covalink and then streptavidin used to bind the probes.

More specifically, the linkage method includes dissolving DNA in water (7.5 ng/ul),
denaturing for 10 min. at 95°C and cooling on ice for 10 min. Ice-cold 0.1 M
1-methylimidazole, pH 7.0 (1-MeIm7), is then added to a final concentration of 10 mM
1-MeIm7. The ss DNA solution is then dispensed into Covalink NH strips (75 ul/well)
standing on ice.

Carbodiimide, 0.2 M 1-ethyl-3-(3-dimethylaminopropyl)-carbodiimide (EDC),
dissolved in 10 mM 1-MeIm7, is made fresh and 25 ul is added per well. The strips are
incubated for 5 hours at 50°C. After incubation the strips are washed using, e.g.,
Nunc-Immuno Wash; first the wells are washed 3 times, then they are soaked with washing
solution for 5 min., and finally they are washed 3 times (wherein the washing solution is 0.4
N NaOH, 0.25% SDS heated to 50°C).
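
As a rough check on the quantities in the coupling step above, the following sketch computes the final per-well amounts. It assumes simple volume additivity and that the 75 ul of ss DNA solution dispensed per well still contains about 7.5 ng/ul of DNA (the small dilution from adding 1-MeIm7 is ignored). The function and variable names are illustrative only and do not come from the cited protocol.

```python
def final_concentration(stock_conc, stock_volume, total_volume):
    """Concentration after diluting stock_volume of a stock into total_volume."""
    return stock_conc * stock_volume / total_volume


dna_volume_ul = 75.0   # ss DNA solution dispensed per well
edc_volume_ul = 25.0   # freshly made 0.2 M EDC added per well
total_ul = dna_volume_ul + edc_volume_ul

# Final EDC molarity in the well (assumes volumes simply add).
edc_final_molar = final_concentration(0.2, edc_volume_ul, total_ul)

# Approximate DNA mass per well (assumes ~7.5 ng/ul in the dispensed solution).
dna_per_well_ng = 7.5 * dna_volume_ul

print(f"total volume per well: {total_ul} ul")
print(f"final EDC concentration: {edc_final_molar} M")
print(f"approximate DNA per well: {dna_per_well_ng} ng")
```

Because both the DNA solution and the EDC solution are at 10 mM 1-MeIm7, the buffer concentration in the well is unchanged by the addition.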

It is contemplated that a further suitable method for use with the present invention is
that described in PCT Patent Application WO 90/03382 (Southern & Maskos), incorporated
herein by reference. This method of preparing an oligonucleotide bound to a support
involves attaching a nucleoside 3'-reagent through the phosphate group by a covalent
phosphodiester link to aliphatic hydroxyl groups carried by the support. The
oligonucleotide is then synthesized on the supported nucleoside and protecting groups
removed from the synthetic oligonucleotide chain under standard conditions that do not
cleave the oligonucleotide from the support. Suitable reagents include nucleoside
phosphoramidite and nucleoside hydrogen phosphonate.

An on-chip strategy for the preparation of DNA probe arrays may be employed. For
example, addressable laser-activated photodeprotection
may be employed in the chemical synthesis of oligonucleotides directly on a glass surface,
as described by Fodor et al. (1991) Science 251(4995) 767-73, incorporated herein by
reference. Probes may also be immobilized on nylon supports as described by Van Ness et
al. (1991) Nucleic Acids Res. 19(12) 3345-50; or linked to Teflon using the method of
Duncan & Cavalier (1988) Anal. Biochem. 169(1) 104-8; all references being specifically
incorporated herein.

Linking an oligonucleotide to a nylon support, as described by Van Ness et al.
(1991), requires activation of the nylon surface via alkylation and selective activation of the
5'-amine of oligonucleotides with cyanuric chloride.

One particular way to prepare support bound oligonucleotides is to utilize the
light-generated synthesis described by Pease et al., (1994) PNAS USA 91(11) 5022-6,
incorporated herein by reference. These authors used current photolithographic techniques
to generate arrays of immobilized oligonucleotide probes (DNA chips). These methods, in
which light is used to direct the synthesis of oligonucleotide probes in high-density,
miniaturized arrays, utilize photolabile 5'-protected N-acyl-deoxynucleoside
phosphoramidites, surface linker chemistry and versatile combinatorial synthesis strategies.
A matrix of 256 spatially defined oligonucleotide probes may be generated in this manner.
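
One way to see where a figure such as 256 can arise: if four positions of a probe are each varied over the four bases, 4^4 = 256 distinct sequences result, and a combinatorial scheme needs only 4 x 4 = 16 base-addition steps to build them all in parallel. The sketch below enumerates such a set. The choice of four variable positions is an assumption made for illustration and is not stated in the text above; the function names are hypothetical.

```python
from itertools import product

BASES = "ACGT"


def all_probes(n_positions):
    """Every sequence obtained by varying n_positions over the four bases."""
    return ["".join(p) for p in product(BASES, repeat=n_positions)]


def coupling_steps(n_positions):
    """One masked coupling step per base per position."""
    return len(BASES) * n_positions


probes = all_probes(4)
print(len(probes), "probes from", coupling_steps(4), "coupling steps")
```

The point of the comparison is that the number of probes grows exponentially with the number of variable positions while the number of synthesis steps grows only linearly.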

4.21 PREPARATION OF NUCLEIC ACID FRAGMENTS

The nucleic acids may be obtained from any appropriate source, such as cDNAs,
genomic DNA, chromosomal DNA, microdissected chromosome bands, cosmid or YAC
inserts, and RNA, including mRNA without any amplification steps. For example,
Sambrook et al. (1989) describe three protocols for the isolation of high molecular weight
DNA from mammalian cells (p. 9.14-9.23).

DNA fragments may be prepared as clones in M13, plasmid or lambda vectors
and/or prepared directly from genomic DNA or cDNA by PCR or other amplification
methods. Samples may be prepared or dispensed in multiwell plates. About 100-1000 ng of
DNA samples may be prepared in 2-500 ml of final volume.

The nucleic acids would then be fragmented by any of the methods known to those
of skill in the art including, for example, using restriction enzymes as described at 9.24-9.28
of Sambrook et al. (1989), shearing by ultrasound and NaOH treatment.

Low pressure shearing is also appropriate, as described by Schriefer et al. (1990)
Nucleic Acids Res. 18(24) 7455-6, incorporated herein by reference. In this method, DNA
samples are passed through a small French pressure cell at a variety of low to intermediate
pressures. A lever device allows controlled application of low to intermediate pressures to
the cell. The results of these studies indicate that low-pressure shearing is a useful
alternative to sonic and enzymatic DNA fragmentation methods.

One particularly suitable way for fragmenting DNA is contemplated to be that using
the two base recognition endonuclease, CviJI, described by Fitzgerald et al. (1992) Nucleic
Acids Res. 20(14) 3753-62. These authors described an approach for the rapid
fragmentation and fractionation of DNA into particular sizes that they contemplated to be
suitable for shotgun cloning and sequencing.

The restriction endonuclease CviJI normally cleaves the recognition sequence
PuGCPy between the G and C to leave blunt ends. Atypical reaction conditions, which alter
the specificity of this enzyme (CviJI**), yield a quasi-random distribution of DNA
fragments from the small molecule pUC19 (2688 base pairs). Fitzgerald et al. (1992)
quantitatively evaluated the randomness of this fragmentation strategy, using a CviJI**
digest of pUC19 that was size fractionated by a rapid gel filtration method and directly
ligated, without end repair, to a lac Z minus M13 cloning vector. Sequence analysis of 76
clones showed that CviJI** restricts PyGCPy and PuGCPu, in addition to PuGCPy sites, and
that new sequence data is accumulated at a rate consistent with random fragmentation.
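
The site specificities described above can be illustrated with a short sketch that locates the cut positions in a sequence under the normal (PuGCPy) and relaxed (PuGCPy, PyGCPy and PuGCPu) recognition rules. It assumes a linear, single-strand text representation and treats every matching site as cut to completion, which a real partial digest would not do. The function names and the test sequence are hypothetical.

```python
import re

# Zero-width lookaheads so that overlapping sites are all reported.
NORMAL_SITE = re.compile(r"(?=[AG]GC[CT])")                         # PuGCPy
RELAXED_SITE = re.compile(r"(?=[AG]GC[CT]|[CT]GC[CT]|[AG]GC[AG])")  # adds PyGCPy, PuGCPu


def cut_positions(seq, relaxed=False):
    """Indices at which the sequence is cut (between the G and the C of each site)."""
    pattern = RELAXED_SITE if relaxed else NORMAL_SITE
    return [m.start() + 2 for m in pattern.finditer(seq.upper())]


def fragments(seq, relaxed=False):
    """Blunt-ended fragments produced by cutting at every site."""
    bounds = [0] + cut_positions(seq, relaxed) + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:])]


example = "TTAGCTAACGCCAATGCAGGAGCAT"
print(cut_positions(example))                # normal specificity
print(cut_positions(example, relaxed=True))  # relaxed (CviJI**) specificity
```

In the example sequence the site TGCA (PyGCPu) is cut under neither rule, which matches the statement that only PyGCPy and PuGCPu are added under the relaxed conditions.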

As reported in the literature, advantages of this approach compared to sonication and
agarose gel fractionation include: smaller amounts of DNA are required (0.2-0.5 ug instead
of 2-5 ug); and fewer steps are involved (no preligation, end repair, chemical extraction, or
agarose gel electrophoresis and elution are needed).

Irrespective of the manner in which the nucleic acid fragments are obtained or
prepared, it is important to denature the DNA to give single stranded pieces available for
hybridization. This is achieved by incubating the DNA solution for 2-5 minutes at 80-90°C.
The solution is then cooled quickly to 2°C to prevent renaturation of the DNA fragments
before they are contacted with the chip. Phosphate groups must also be removed from
genomic DNA by methods known in the art.

4.22 PREPARATION OF DNA ARRAYS 

Arrays may be prepared by spotting DNA samples on a support such as a nylon
membrane. Spotting may be performed by using arrays of metal pins (the positions of which
correspond to an array of wells in a microtiter plate) to repeatedly transfer about 20 nl of
a DNA solution to a nylon membrane. By offset printing, a density of dots higher than the
density of the wells is achieved. One to 25 dots may be accommodated in 1 mm²,
depending on the type of label used. By avoiding spotting in some preselected number of
rows and columns, separate subsets (subarrays) may be formed. Samples in one subarray
may be the same genomic segment of DNA (or the same gene) from different individuals, or
may be different, overlapping genomic clones. Each of the subarrays may represent replica
spotting of the same samples. In one example, a selected gene segment may be amplified
from 64 patients. For each patient, the amplified gene segment may be in one 96-well plate
(all 96 wells containing the same sample). A plate for each of the 64 patients is prepared. By
using a 96-pin device, all samples may be spotted on one 8 x 12 cm membrane. Subarrays
may contain 64 samples, one from each patient. Where the 96 subarrays are identical, the
dot span may be 1 mm² and there may be a 1 mm space between subarrays.
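
The 64-patient example can be checked with a small layout calculation. The sketch below assumes that each subarray is an 8 x 8 block of 1 mm dots, that the 96 pins form an 8 x 12 grid, and that patients are laid out row by row within a subarray. None of these layout details is stated above, so they are illustrative assumptions, as are the function names.

```python
PIN_ROWS, PIN_COLS = 8, 12   # 96-pin device (assumed 8 x 12 grid)
SUB_SIDE = 8                 # assumed 8 x 8 = 64 dots per subarray
DOT_MM = 1.0                 # dot span along one side
GAP_MM = 1.0                 # space between subarrays
PITCH_MM = SUB_SIDE * DOT_MM + GAP_MM   # distance between subarray origins


def dot_origin_mm(pin, patient):
    """Top-left corner (x, y) in mm of the dot for pin 0-95 and patient 0-63."""
    pin_row, pin_col = divmod(pin, PIN_COLS)
    sub_row, sub_col = divmod(patient, SUB_SIDE)
    x = pin_col * PITCH_MM + sub_col * DOT_MM
    y = pin_row * PITCH_MM + sub_row * DOT_MM
    return x, y


def footprint_mm():
    """Overall width and height in mm occupied by all 96 subarrays."""
    width = PIN_COLS * PITCH_MM - GAP_MM
    height = PIN_ROWS * PITCH_MM - GAP_MM
    return width, height


print("pitch between subarrays:", PITCH_MM, "mm")
print("array footprint:", footprint_mm(), "mm")
```

Under these assumptions the subarray pitch is 9 mm and the whole array occupies 107 mm by 71 mm, which fits within an 8 x 12 cm membrane.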

Another approach is to use membranes or plates (available from NUNC, Naperville,
Illinois) which may be partitioned by physical spacers, e.g., a plastic grid molded over the
membrane, the grid being similar to the sort of membrane applied to the bottom of multiwell
plates, or hydrophobic strips. A fixed physical spacer is not preferred for imaging by
exposure to flat phosphor-storage screens or x-ray films.

The present invention is illustrated in the following examples. Upon consideration 
of the present disclosure, one of skill in the art will appreciate that many other embodiments 
and variations may be made in the scope of the present invention. Accordingly, it is 
intended that the broader aspects of the present invention not be limited to the disclosure of 
the following examples. The present invention is not to be limited in scope by the 
exemplified embodiments which are intended as illustrations of single aspects of the 
invention, and compositions and methods which are functionally equivalent are within the 
scope of the invention. Indeed, numerous modifications and variations in the practice of the 
invention are expected to occur to those skilled in the art upon consideration of the present 
preferred embodiments. Consequently, the only limitations which should be placed upon 
the scope of the invention are those which appear in the appended claims. 

All references cited within the body of the instant specification are hereby 
incorporated by reference in their entirety. 

5.0 EXAMPLES 

5.1 EXAMPLE 1 

Novel Nucleic Acid Sequences Obtained From Various Libraries 

A plurality of novel nucleic acids were obtained from cDNA libraries prepared from 
various human tissues and in some cases isolated from a genomic library derived from 
human chromosome using standard PCR, SBH sequence signature analysis and Sanger 
sequencing techniques. The inserts of the library were amplified with PCR using primers 
specific for the vector sequences which flank the inserts. Clones from cDNA libraries were 
spotted on nylon membrane filters and screened with oligonucleotide probes (e.g., 7-mers) 
to obtain signature sequences. The clones were clustered into groups of similar or identical 
sequences. Representative clones were selected for sequencing. 
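
The clustering step can be pictured with a minimal sketch in which each clone's signature is the set of probes that scored positive for it, and clones whose signatures are sufficiently similar are grouped together. The similarity measure (Jaccard index), the 0.8 threshold, the single-linkage grouping and all names are assumptions chosen for illustration. The procedure actually used is not described in this section.

```python
def jaccard(a, b):
    """Jaccard similarity of two probe sets (1.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def cluster_clones(signatures, threshold=0.8):
    """Group clones whose signatures are linked by similarity >= threshold.

    signatures: dict mapping clone id -> set of positive probes.
    Returns a list of sets of clone ids (single-linkage clusters).
    """
    ids = sorted(signatures)
    parent = {c: c for c in ids}

    def find(c):
        # Union-find root lookup with path halving.
        while parent[c] != c:
            parent[c] = parent[parent[c]]
            c = parent[c]
        return c

    for i, a in enumerate(ids):
        for b in ids[i + 1:]:
            if jaccard(signatures[a], signatures[b]) >= threshold:
                parent[find(a)] = find(b)

    clusters = {}
    for c in ids:
        clusters.setdefault(find(c), set()).add(c)
    return sorted(clusters.values(), key=sorted)


toy_signatures = {
    "clone1": {"AAACGTT", "CGTTAGC", "GGATCCA", "TTGACCA", "ACGTACG"},
    "clone2": {"AAACGTT", "CGTTAGC", "GGATCCA", "TTGACCA", "ACGTACG"},
    "clone3": {"CCCTTTA", "GATTACA"},
}
print(cluster_clones(toy_signatures))
```

A representative clone could then be picked from each returned set for sequencing.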

In some cases, the 5' sequence of the amplified inserts was then deduced using a
typical Sanger sequencing protocol. PCR products were purified and subjected to
fluorescent dye terminator cycle sequencing. Single pass gel sequencing was done using a
377 Applied Biosystems (ABI) sequencer to obtain the novel nucleic acid sequences. In
some cases RACE (Rapid Amplification of cDNA Ends) was performed to further extend
the sequence in the 5' direction.

5.2 EXAMPLE 2 
Novel Nucleic Acids 

The novel nucleic acids of the present invention were assembled
from sequences that were obtained from a cDNA library by methods described in Example
1 above, and in some cases sequences obtained from one or more public databases. The
nucleic acids were assembled using an EST sequence as a seed. Then a recursive algorithm
was used to extend the seed EST into an extended assemblage, by pulling additional
sequences from different databases (i.e., Hyseq's database containing EST sequences,
dbEST version 119, gb pri 119, and UniGene version 119) that belong to this assemblage.
The algorithm terminated when there were no additional sequences from the above databases
that would extend the assemblage. Inclusion of component sequences into the assemblage
was based on a BLASTN hit to the extending assemblage with BLAST score greater than
300 and percent identity greater than 95%.
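
The seed-extension procedure described above can be sketched as a simple loop that keeps adding database sequences until no further qualifying hit is found. The sketch is a simplification under stated assumptions: it queries with each member sequence rather than with a rebuilt consensus of the growing assemblage, and `find_hits` is a hypothetical stand-in for a BLASTN search against the listed databases. Only the two inclusion thresholds (score greater than 300, identity greater than 95%) come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    seq_id: str       # identifier of the database sequence
    score: float      # BLAST score of the hit
    identity: float   # percent identity of the hit


def extend_assemblage(seed_id, find_hits, min_score=300.0, min_identity=95.0):
    """Collect the ids of all sequences reachable from the seed via qualifying hits."""
    members = {seed_id}
    frontier = [seed_id]
    while frontier:                      # stops when nothing new extends the set
        query = frontier.pop()
        for hit in find_hits(query):
            if hit.seq_id in members:
                continue
            if hit.score > min_score and hit.identity > min_identity:
                members.add(hit.seq_id)
                frontier.append(hit.seq_id)
    return members


# Toy stand-in for database searches (hypothetical identifiers and scores).
TOY_HITS = {
    "seed": [Hit("est2", 450.0, 98.0), Hit("est3", 280.0, 99.0)],
    "est2": [Hit("est4", 310.0, 96.5), Hit("est5", 500.0, 94.0)],
    "est4": [Hit("seed", 450.0, 98.0)],
}


def toy_find_hits(seq_id):
    return TOY_HITS.get(seq_id, [])


assemblage = extend_assemblage("seed", toy_find_hits)
print(sorted(assemblage))
```

In the toy data, est3 is rejected on score and est5 on percent identity, so the assemblage ends with the seed, est2 and est4.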

Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA
sequence and its corresponding protein sequence were generated from the assemblage. Any
frame shifts and incorrect stop codons were corrected by hand editing. During editing, the
sequence was checked using FASTY and/or BLAST against Genbank (i.e., dbEST version
120, gb pri 120, UniGene version 120, Genpept release 120). Other computer programs
which may have been used in the editing process were phredPhrap and Consed (University
of Washington) and ed-ready, ed-ext and cg-zip-2 (Hyseq, Inc.). The full-length nucleotide
and amino acid sequences, including splice variants resulting from these procedures, are
shown in the Sequence Listing as SEQ ID NOS: 1-438.

Table 1 shows the various tissue sources of SEQ ID NO: 1-438.
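
The SEQ ID NO: column of Table 1 uses a compact notation in which single numbers and hyphenated ranges are separated by spaces (e.g., "76-77 91 106-107"). A minimal sketch for expanding that notation and for inverting the table into a lookup from SEQ ID NO to library names follows. The function names and the two-row example dictionary are illustrative; the example rows are copied from the first entries of Table 1.

```python
def parse_seq_ids(text):
    """Expand a string such as '76-77 91 106-107' into a list of integers."""
    ids = []
    for token in text.split():
        if "-" in token:
            start, end = token.split("-")
            ids.extend(range(int(start), int(end) + 1))
        else:
            ids.append(int(token))
    return ids


def libraries_by_seq_id(table):
    """Invert {library name: SEQ ID string} into {SEQ ID: sorted library names}."""
    index = {}
    for library, text in table.items():
        for seq_id in parse_seq_ids(text):
            index.setdefault(seq_id, []).append(library)
    return {seq_id: sorted(libs) for seq_id, libs in index.items()}


example_rows = {
    "ABR011": "85 90",
    "ABR013": "85 322",
}
print(parse_seq_ids("76-77 91 106-107 115"))
print(libraries_by_seq_id(example_rows))
```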

The nearest neighbor results for polypeptides encoded by SEQ ID NO: 1-438
were obtained by a BLASTP (version 2.0a19MP-WashU) search against Genpept,
Geneseq and SwissProt databases using the BLAST algorithm. The nearest neighbor result
showed the closest homologue with functional annotation for SEQ ID NO: 1-438. The
translated amino acid sequences encoded by the nucleic acid sequences are shown
in the Sequence Listing. The homologues with identifiable functions for SEQ ID NO: 1-438
are shown in Table 2 below.

Using the eMatrix software package (Stanford University,
Stanford, CA) (Wu et al., J. Comp. Biol., Vol. 6 pp. 219-235 (1999) herein incorporated
by reference), all the sequences were examined to determine whether they had
identifiable signature regions. Table 3 shows the signature region found in the indicated
polypeptide sequences, the description of the signature, the eMatrix p-value(s) and the
position(s) of the signature within the polypeptide sequence.

Using the Pfam software program (Sonnhammer et al., Nucleic Acids Res., Vol.
26(1) pp. 320-322 (1998) herein incorporated by reference) polypeptides encoded by
SEQ ID NO: 1-438 (i.e. SEQ ID NO: 1-438) were examined for domains with homology
to certain peptide domains. Table 4 shows the name of the domain found, the
description, the product of all the e-values of similar domains found, the Pfam score for
the identified domain within the sequence, the number of similar domains found, and the
position of the domain in the SEQ ID NO: being interrogated.

The GeneAtlas™ software package (Molecular Simulations Inc. (MSI), San
Diego, CA) was used to predict the three-dimensional structure models for the
polypeptides encoded by SEQ ID NO: 1-438 (i.e. SEQ ID NO: 1-438). Models were
generated by (1) PSI-BLAST, which is a multiple alignment sequence profile-based
searching method developed by Altschul et al. (Nucl. Acids. Res. 25, 3389-3408 (1997)), (2)
High Throughput Modeling (HTM) (Molecular Simulations Inc. (MSI), San Diego, CA),
which is an automated sequence and structure searching procedure
(http://www.msi.com/), and (3) SeqFold™, which is a fold recognition method described
by Fischer and Eisenberg (J. Mol. Biol. 209, 779-791 (1998)). This analysis was carried
out, in part, by comparing the polypeptides of the invention with the known NMR
(nuclear magnetic resonance) and x-ray crystal three-dimensional structures as templates.

Table 5 shows: "PDB ID", the Protein DataBase (PDB) identifier given to the template
structure; "Chain ID", the identifier of the subcomponent of the PDB template structure;
"Compound Information", information on the PDB template structure and/or its
subcomponents; "PDB Function Annotation", the function of the PDB template as
annotated by the PDB files (http://www.rcsb.org/PDB/); the start and end amino acid positions
of the protein sequence aligned; the PSI-BLAST score; the verify score; the SeqFold score;
and the Potential(s) of Mean Force (PMF). The verify score is produced by GeneAtlas™
software (MSI) and is based on the Profile-3D threading program developed in
Dr. David Eisenberg's laboratory (US patent no. 5,436,850 and Luthy, Bowie, and
Eisenberg, Nature, 356:83-85 (1992)) and a publication by R. Sanchez and A. Sali, Proc.
Natl. Acad. Sci. USA, 95:13597-13602. GeneAtlas
normalizes the verify score for proteins with different lengths so that a unified cutoff can
be used to select good models, as follows:

Verify score (normalized) = (raw score - 1/2 high score)/(1/2 high score)

The PMF score, produced by GeneAtlas™ software (MSI), is a composite scoring
function that depends in part on the compactness of the model, sequence identity in the
alignment used to build the model, and pairwise and surface mean force potentials (MFP). As
given in Table 5, a verify score between 0 and 1.0, with 1 being the best, represents a good
model. Similarly, a PMF score between 0 and 1.0, with 1 being the best, represents a good
model. A SeqFold™ score of more than 50 is considered significant. A good model may
also be determined by one of skill in the art based on all the information in Table 5 taken in
totality.
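
The normalization formula and the cutoffs stated above can be written out as a short sketch. The function names are hypothetical, and applying all three criteria together as a single filter is only one possible reading, since the text also allows a model to be judged on the Table 5 information taken in totality.

```python
def normalized_verify(raw_score, high_score):
    """(raw score - 1/2 high score) / (1/2 high score), as given in the text."""
    half = 0.5 * high_score
    return (raw_score - half) / half


def meets_stated_cutoffs(verify, pmf, seqfold):
    """True when verify and PMF lie in [0, 1] and SeqFold exceeds 50."""
    return 0.0 <= verify <= 1.0 and 0.0 <= pmf <= 1.0 and seqfold > 50


# Hypothetical scores for a single model.
verify = normalized_verify(raw_score=75.0, high_score=100.0)
print(verify, meets_stated_cutoffs(verify, pmf=0.9, seqfold=62.0))
```

With this normalization, a raw score equal to half the high score maps to 0 and a raw score equal to the high score maps to 1, regardless of protein length.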

The nucleotide sequence within the sequences that codes for signal peptide
sequences and their cleavage sites can be determined using the Neural Network SignalP
V1.1 program (from Center for Biological Sequence Analysis, The Technical University
of Denmark). The process for identifying prokaryotic and eukaryotic signal peptides and
their cleavage sites is also disclosed by Henrik Nielsen, Jacob Engelbrecht, Soren
Brunak, and Gunnar von Heijne in the publication "Identification of prokaryotic and
eukaryotic signal peptides and prediction of their cleavage sites" Protein Engineering,
Vol. 10, no. 1, pp. 1-6 (1997), incorporated herein by reference. A maximum S score and
a mean S score, as described in the Nielsen et al. reference, were obtained for the
polypeptide sequences. Table 6 shows the position of the signal peptide in each of the
polypeptides and the maximum score and mean score associated with that signal peptide.
Table 7 correlates each of SEQ ID NO: 1-438 to a specific chromosomal location.
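
To make the two summary values concrete, the sketch below derives a maximum S score and a mean S score from a list of per-residue S scores and a predicted signal peptide length. It assumes that the mean is taken over the residues of the predicted signal peptide and the maximum over all supplied scores. The example numbers are invented, and the function is not part of the SignalP program.

```python
def s_score_summary(s_scores, signal_peptide_length):
    """Return (maximum S score, mean S score over the predicted signal peptide)."""
    if not 0 < signal_peptide_length <= len(s_scores):
        raise ValueError("signal peptide length must lie within the scored residues")
    max_s = max(s_scores)
    mean_s = sum(s_scores[:signal_peptide_length]) / signal_peptide_length
    return max_s, mean_s


# Invented per-residue scores for a six-residue toy sequence.
toy_scores = [0.9, 0.8, 1.0, 0.7, 0.2, 0.1]
max_s, mean_s = s_score_summary(toy_scores, signal_peptide_length=4)
print(max_s, round(mean_s, 3))
```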

Table 8 is a correlation table of the novel polynucleotide sequences SEQ ID NO:
1-438, novel polypeptide sequences SEQ ID NO: 1-438, and their corresponding priority
nucleotide sequences in the priority application USSN 09/774,528, herein incorporated
by reference in its entirety.

Table 1

Tissue Origin | RNA/Tissue Source | Library Name | SEQ ID NO:
adult brain | GIBCO | AB3001 | 76-77 91 106-107 115 134 163-164 178 203 232 255 265 276 279 322-323
adult brain | GIBCO | ABD003 | 16 19 24 77 80-81 85 89-90 92 96 98 105 110 116 121-123 125 130-132 134-136 138 142-143 151 153 158-159 163-164 184 191 193 196 198 200 208-209 213-214 216 219-220 223 229 232-234 236 239 241 243 257-259 262 265 267 274-276 278 284 292 302 317 321 324-325 327 337-338 340 348 359 371 391-392 400
adult brain | Clontech | ABR001 | 1 18-19 35 80 98 125 136 153 185 200 209 221 228-229 239 243 274-275 302 399-400
adult brain | Clontech | ABR0065 | 7-8 18 32 35 52 57 85 91 96 111 113 126 131 135 138-139 142 148 153-154 181 188 192 199 209-211 217 221 224 226 229 233 235 238 243 248 273 283-284 286 292 316 322 348 357 361 367 376 378 399 407 409 417 428
adult brain | Clontech | ABR008 | 2 4 6-11 19-21 23-25 31 35-37 39-41 45-46 72-73 76 80-81 85 88-90 94-95 97 102-105 109 111-112 114-119 121-122 126-131 134-135 138-139 144 146-150 152-153 156-157 159 168-172 174-175 178 180 182 185-186 189-190 194 196 198-201 203 205-210 217 219 221-222 224 229-230 232-233 236-239 243-244 248 253-256 260-261 263-265 273 276 281-282 286-289 291-292 299-300 302 304 315-317 319 321-322 324 326 329 331-332 341 352-357 360 362 365 367-368 370 376-377 379-380 383-384 387-389 391-392 394 396-402 407-410 412-413 419 425-426 [illegible]
adult brain | Clontech | ABR011 | 85 90
adult brain | BioChain | ABR012 | 148 213
adult brain | BioChain | ABR013 | 85 322
adult brain | Invitrogen | ABR014 | 9 23 85 146 200 233 282 321 330
adult brain | Invitrogen | ABR015 | 14 31 69 121 124 163 209 216 224 291 377
adult brain | Invitrogen | ABR016 | 92 136 219 279
adult brain | Invitrogen | ABT004 | 2 7-8 20-21 33 85 90-91 95 97 102-103 108 121 123 129-131 138-139 143 146 151 153 157-158 172 178 180 209-210 213 219 229-230 232 234 239 308 321 330 360 365 370-373 375 401 412
adipocytes | Stratagene | ADP001 | 3-4 23 36 79 81 106-107 116 129 133-134 147 151 154 158 179 181 192 196 222 230 256-257 287 292 297 313 329 359
adrenal gland | Clontech | ADR002 | 2 25 27 33 57 76 85-86 88 96 98 105-108 114 121-122 125 129-130 134 147 164 178 180 182 198-199 201 205 207-208 240-241 244 246 253-254 257 261 276 280 292 320 329 336 352 403
adult heart | GIBCO | AHR001 | 3 17-21 27 32 74 76 85 89-91 95-96 102-103 105-110 117 121 124-125 128 131 134-136 139 141 148 151-153 155-156 161 163 181-182 186 190 193 198 200-201 205 207 211-213 215 222 225 229-230 234 251-254 257-259 263 274-277 280 292-297 301 303-304 315-316 319 329-331 345 359 384 417 423-424
adult kidney | Invitrogen | AKT002 | 3 6 14 20-21 25-26 76 79 85 89 94 101 111 114 118 121 124 126 130-131 138 146 163 170 177-178 189 196 198 201 204 213 231 253-254 256-259 271 273-275 277 298 315 320 329 342
adult lung | GIBCO | ALG001 | 4 29 74 79 85 90 96 105 111 119 132 134 136 142 144 149 159 181 189 198 200 205-207 226 255 257 263 283 294 300 302-303 328 358-359 365 426
lymph node | Clontech | ALN001 | 6 16 31 105 120 215 257 295 306 309 359
young liver | GIBCO | ALV001 | 10-11 25-26 29 31 33 76 85 95 115 121-122 124 126 130 143 146 156 158 164 178 182 187 189 229 248 253-254 261 278 283 304 342 375
adult liver | Invitrogen | ALV002 | 10-12 22 26 31 33-34 38 53 56 90-92 94-95 118 121 124 128-129 138 141 146 148 153 156 161 171 178 198 216 232 248 253-254 256-257 264 302 306 365 375 383 396
adult liver | Clontech | ALV003 | 10-11 156 171 188
ovary | Invitrogen | AOV001 | 3-8 10-11 14 16 19-22 24 27-31 34 36 57 73 75-76 81-82 85 89-91 94-98 104-109 111 115-116 121-128 130-131 134 136 138-139 141 143-144 146 149-150 152 155 157-160 163-166 170-173 175 177-178 180 182 184-187 189-190 193-194 196-197 200-201 212-213 215 217 222 225-226 228 230-233 235 241-243 245 248 253-259 261 266-267 270 272-273 276-278 283-285 287 289 292 297-299 305-306 315-317 319 323-325 329-331 341 343-344 352 358-359 363-366 382-383 386 389-390 412
placenta | Invitrogen | APL001 | 73 92 117 135 182 194 232 246 261 272 282 359
placenta | Invitrogen | APL002 | 16 28 92 121 135 144 157 178 210 394
adult spleen | GIBCO | ASP001 | 3-4 16 32-33 35 90 96 99-100 123-125 128 131 134 136 139 151 178 181 189 194 200 210 218 229 251 253-255 257 276 283 307-309 315 329 354-355 357 392 400
testis | GIBCO | ATS001 | 22 73 82 91 96-97 104-105 117 124 130 134 164 173 200 209 222 233 241 253-254 257 285 287-288 305 325 329 351-353 359
bladder | Invitrogen | BLD001 | 4 108 130 150 212 226 236 240 242 257 276 287 305 395-396 415
bone marrow | Clontech | BMD001 | 1 4-5 22 29-30 34 72 85 88 90 92 94 98 104-107 109 111 113 117 120 123-125 128-129 132 135 140 142 144 146 152 163 165-166 170-173 177 180 182 186 189-190 198-209 215 222 225 232 240-246 251-252 260-261 273-275 277-280 283-285 300 316 318 346-347 359
bone marrow | GF | BMD002 | 1 4 7-8 10-11 16 19 25 31 49 61-62 72 74 76 80 85 88 90 93-95 97-101 109-110 112 114 116-117 121 126 129 132 135 141 144 146 149-150 154 157 160 162-163 165-166 170-172 175 178-180 182-183 186-190 192-194 198-200 203 208 210-213 215 223 225 234 242 245 247 251-254 256-257 265 270 273 276-278 280 285 287 289 291 293-294 299 302 307 309 315 322 324 337-338 353 356-357 359 367 369 388 407 414 419 426 434
bone marrow | Clontech | BMD007 | 144
*Mixture of 16 tissues - mRNA | Various Vendors | CGd010 | 1 34-35 95 152 161 171 182 206 219 242 260 267 276 280 288 297 300 315-316 412
*Mixture of 16 tissues - mRNA | Various Vendors | CGd011 | 45 51 167 188 216 251-252
*Mixture of 16 tissues - mRNA | Various Vendors | CGd012 | 2 10-11 18-21 29 31 34-35 40 42-43 45 48 50-52 69-71 87-89 94-95 98-105 109 111-113 117 120 123 125 127 131 135-136 138 146 158 163 165-169 175 180 187-188 191 198 201 208 216 219-221 224 226 234 236 238-239 241-246 251-252 260 264 270 276-277 279 281 283-284 287 295-296 314 319 321 327-328 331 333-334 337-341 343 351-352 361 365 369 379-380 387 389 395 397-399 402 406 410-412 417 419 424 426 431-433
*Mixture of 16 tissues - mRNA | Various Vendors | CGd013 | 29 48 101 146 167-169 187 219 234 327 333 339 341 365 412 433
*Mixture of 16 tissues - mRNA | Various Vendors | CGd015 | 29 86 90 95 98 110 113 118 132 158 171 184 193 218-220 243 284 310 385 410 419
*Mixture of 16 tissues - mRNA | Various Vendors | CGd016 | 3-4 20-21 29 38 85 88-89 95 105 119 122 131-133 140 185 211-212 225 256-257 273 276 302 318 379-380 390 400 419
colon | Invitrogen | CLN001 | 4 25 33 85 138 146 148 158-159 198 210 229 301 360 384 397
cervix | BioChain | CVX001 | 3 5 10-11 18 20-21 24-25 29 36 41 47 57 63 72 74 76 86 90 94 104 108-109 111 125 127 130 134 138 144 147 162 174 178-179 182 186 189 193 197 211 222 225-226 228 232 241 243 257 261 267 270 273-275 278-281 288-289 298 301-302 305 315 319 324-325 329 331 337-338 359 391-392 395 420
endothelial cells | Stratagene | EDT001 | [illegible] 18-19 24 [illegible] 98 104-107 111 117 119-121 124-131 134 136 138-139 141 144 146-147 149 152 158-159 166-167 170-173 178-179 182-183 186-187 191 193-194 196-197 200 210-211 222-224 226 231-232 236 241 243 246 248 253-256 258-259 276 279 282 287 292 300 302-303 315 329 337-338 358-362 382-383 385-388
esophagus 


BioChain 


ESO002 


257 


fetal brain 


Clontech 


FBR001 


34 


fetal brain 


Clontech 


FBR004 


3 139 144 271 284 337-338 


fetal brain 


Clontech 


FBR006 


4 6-11 14 18-21 24 28 31 37-38 40 63 76 85 








87 89-90 94-95 97 105 108-109 112-113 115 
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Table 1 



Tissue Origin 



RNA/Tissue 
Sourch 



Library 
Name 



SEQ ID NO: 



117-120 
170 172 
199 201 
232-233 
281 288 
330-331 
380 383 
419 421 



127-130 
175 180 
203 209- 
240 243 
-289 292 
356-357 
389 397 
423 



133 138 
182 186- 
210 215 
245 253- 
295 304 
359-360 
399-401 



140 144 
188 190 
219 222 
255 270 
315 317 
364 367 
408-409 



-146 148 
192 194 
229-230 
273 276 
319 324 
-368 379- 
411 413 



fetal brain 



Invitrogen 



FBT002 



2 14 19 23 28 31 90 94 105 121 124 126 131 

135 139 142 149 158 186 193 198 210 214- 

215 232 239 242 248 255 267 326 332 365 

369 371 376-383 394 399 



fetal heart 



Invitrogen 



FHR001 



4 7-8 10-11 14 17-21 28-29 31-32 60 64-65 
73 85 87 92 95 102-103 105 108 111 113 117 
119 121 125 128-129 134-135 141 152 154 
156-157 160-161 172 176 178 194 196 198- 
200 203 208 212 215 218 222 226 229 233- 
234 253-257 261 265 272 276 281 292-293 
295 303 305 319 325 327 337-338 341 345 
349 354-355 367-368 389 395-396 398 412 
417 436 



fetal kidney 



Clontech 



FKD001 



1 14 22 94 110 115 132 134-135 146 178 189 
199 235-236 242 247 257 267 292 295 359 



fetal kidney 



Clontech 



FKD002 



22 31 38 40 46 94 122 127 131 156 160 194 
198 229 253-254 270 292 303 319 354-355 
389 396 



fetal kidney 



Invitrogen 



FKD007 



303 



fetal lung 



Clontech 



FLG001 



85 89 98-100 111 175 271 281 369 



fetal lung 



Invitrogen 



FLG003 



84 88 106-107 122 135 140 146 160 181 246 
272 284 292 328 330 396 404 416 426 



fetal liver- 
spleen 



Soares 



FLS001 



1-3 6-12 14 19 23 28-31 33 57 59-60 72-76 
78 80 83 85-138 140-141 143-144 146-155 
157-161 163-197 200 204 208 210-211 223 
225 230 232-233 235 241-243 245-266 268- 
273 277 281 285-287 292 297 303 314 329 
343 346-347 357-359 369 397 399 407 415 



fetal liver- 
spleen 



Soares 



FLS002 



1 3-4 6 10-12 23-24 29 31-33 35-37 53-54 
74-76 79 81-82 86-89 91 94-95 99-104 106- 
109 111-112 115 117-120 122 125-126 128- 
129 132 134 136-138 141 146 149 153 157- 
159 162-166 170 172 175 178-180 183 185- 
191 194 196-197 205 207-212 222-225 228 
232-233 239-241 248 251-252 255-256 258- 
259 261-262 264 266-267 270-271 273-275 
277-278 283 285 287 298 305 315 317-318 
322 330-332 337-338 341 343 349 357-360 
365 388 390-391 399 402 418 424 



fetal liver- 
spleen 



Soares 



FLS003 



12 29 91 98 111 119 156 163 165 178 186 
193 210-211 276 286 315 322 346-347 357 
365 424 



fetal liver 



Invitrogen 



FLV001 



7-8 14 35 118 122-123 129 146 182 211 230 
232 248 251-252 264 287 304 337-338 344 
346-347 352 365 367-369 



fetal liver 



Clontech 



FLV002 



102-103 147 149 300 



fetal liver 



Clontech 



FLV004 



73 85 105 108 118 122 126 141 156-157 161 
165 170 178 180 182 194 215 218 225 240 
Table 1 



Tissue Origin 



RNA/ Tissue 
Source 



Library 



SEQ ID NO: 



242 247 251-252 292 330 337-338 369 407 
411 440 



fetal muscle 



Invitrogen 



FMS002 



5 9 17-18 20-21 29 38 85 88 97 106-107 129 

131 136 150-152 155 165 170 179 182 192- 

193 212-213 229 234 242 258-259 270 282 

286 289 300 316 319 345 351 354-355 360 

389 396 408 410 437 439 



fetal skin 



Invitrogen 



FSK001 



2 4 7-8 29 33 42-43 49 51-52 58 74 82 85 
90 94 110-111 116 118 121 133 136 138-139 

145 151 154 156-157 161-162 172 181 184 

186 193 198 200 205 207 209-211 222 227- 

230 232 235 240 246 253-257 266 270 276 

292 295 299 316 318 323 330 332 337-340 

343 357 369 389 394-395 412 422 427 



fetal skin 



Invitrogen 



FSK002 



4 9 42 44 51 66 72 81 85 89-90 95 98 105 
112-114 119 121 129 133 135 162 172 179- 
182 197 200 208 210 231 243-244 272 304 
316 330 339 354-355 357 360 389 395 410 
417 437 



fetal spleen 



BioChain 



FSP001 



157 223 



umbilical 
cord 



BioChain 



FUC001 



4-6 20-21 25 29 73-74 83 87 89-91 94 101 
109 120 123 125 128 130-131 133 141 143- 
144 147 149 154 161 165 173 175 179 184 
188 210-212 217 226 235 240 248 251-252 
257 262 267 270 277 293 305 307 316 319 
323 327 331 341 356 359 389 392 407 416 



fetal brain 



GIBCO 



HFB001 



2-4 16 20-21 74 77 85 89-91 96-98 104-105 
111 114 118 121-122 124-125 127-128 131 
134 137-140 142 144 146-148 151 153 158- 
159 163-164 166 173 178 180 182 191 194 
196 200 203 209-214 216-232 234-236 238- 
239 243 253-255 263 270 272-273 276 281 
292 310 316 319-321 332 348 357 359 365 
399 



macrophage 



Invitrogen 



HMP001 



2 247 



infant brain 



Soares 



IB2002 



2-4 7-8 19-22 26-27 31-32 35 73-74 80 85 
89 91 96-98 106-107 110 112 118-119 121- 
122 125 128-131 134-144 148 153 164 166 
172-173 177 180 186-187 191-194 196 202- 
203 208-210 217 219 223-224 227 229 232- 
234 236-237 239 241-243 245 248 253-259 
273-275 278-279 282 287 294 298 309 314 
317 322 327 330 333-334 341 348-350 360 
368 376 379-380 382 396 406 424 



infant brain 



Soares 



IB2003 



3-4 20-21 26 28 31 35 73 85 95-96 110 113 
119 122-123 130-131 135 138 140 142-143 
146 153 155 170 172-173 186 191-193 196 
209 219 223 226 229 233-234 236 239 245 
248 253-254 256-257 273 279 291-292 304 
314 337-338 343 359 367 371 376 397 413 



lung, 

fibroblast 



Stratagene 



LFB001 



3 6 31 72-73 90 92 105-107 124 126-127 133 
136 139 144 146 172 189 198 204 233 235 
246 258-259 268 272 276 282 310 335 359 
434 



adult lung 



Invitrogen 



LGT002 



4 19-21 28 33 35-36 49 72 79 81 85 88 90- 
91 94-95 101 106-107 109 118 120-125 127 
130-131 133 135-138 141-142 144 147 149 
157 159-161 163 166 170-173 193-194 196- 
197 212 216 218 221 223 226 228-229 231 
233 241 247-248 253-255 257 261 266-267 
270-275 277-278 282-283 292 298 301 303 
315 318 324 331 335 354-355 359 367 369 
381 392-393 398 














leukocytes 


GIBCO 


LUC001 


1-5 15 19-21 28 30-33 37 72 74 91 94-95 
97-100 108-109 113 115 117 119-122 124-125 
127-128 134-138 141 144 146-148 150-151 
157-158 160 162-167 170-173 175-178 180-181 
187 189 192 194 197 200 212-213 215-216 
218-219 223 225 228-232 241-242 245-246 
251-254 261 272-276 278-282 284 287-290 
297-298 305 307 310-314 325 331 336 340 
358-359 372 399 414 












leukocytes 


Clontech 


LUC003 


1 5 124 171 176 204 225 248 253-254 283 
285 307 315 
















melanoma 


Clontech 


MEL004 


4-5 24 27 72-74 81 85 106-107 113 136 177 
203 205-207 209 231 243 284-285 315-316 
320 326 359 374 428 












mammary gland 


Invitrogen 


MMG001 


2 4-5 7-8 10-12 29 31 34-35 38 50 80-81 85 
89-90 92 94-97 105 108-109 119-124 126 
128-130 135 138-139 141-142 144 146-147 
153 155 157-159 163 178-179 181-182 198 
200 209-210 219 223 228 230 232-233 235- 
236 239 242 248 253-255 257 260-261 265- 
267 270 272 281 287 292 294 315-316 318 
324 327 330 337-340 354-355 357 369 372 
383 392-395 401 404 












ncLiroij 


Stratagene 


wmnni 


35 47 89-90 111 118 164 232 253-254 276 
324 331 382 
















neuron 


Stratagene 


JMiKUU x 


20-21 37 122 147-149 170 179 181 186 212 
226 258-259 265 276 369 436 438 






neuronal 
cells 


Stratagene 


JMl uuux 


7-8 37 55 80 85 112 118 126-127 133 138 
140-141 151 170 181 210 214 225-226 236 
243 287 328 330-331 357 383 400 436 




pituitary 
gland 


Clontech 




92 124 159 231 


























placenta 


Clontech 


Tit A A AO 


34 46 88 126 128 159 182 186 197 201 267 
278 281-282 305 330 356 361 365 418 




prostate 


Clontech 


nnmA A *| 
PKIUU L 


18 36 72 74 86 95 106-107 111 118 122 144 
161 179 211 218 233 286 297 










Tnvi t* T-nrr^ri 
juii v j. c n 


RRP001 


9 31 85 121 128 147 171 200 219 257 292 
340 394 398 407 412 












salivary 
gland 


Clontech 


SAL001 


3 24 38 80 122 136 147 189 241 282 296 310 
351 392 395 415 














saliva gland 


Clontech 


SALS 03 


118 


small 
intestine 


Clontech 


SIN001 


12 16 25 82-83 89-90 93 95 98 105-109 111 
122-123 125-128 133-134 137 139 142 161 
167 171 184 197 201 204 212 218 236 242- 
243 248-249 253-254 257 267 276 284-285 
292 297 300 303 310 313 317-318 325 340 
343 352 354-355 359 383 391 416 






spinal cord 


Clontech 


SPC001 


3 39 84 86 94 96 105 115 117 130-131 134 
136 141 143 148 155 176 190-191 203 213 
224 233-234 236 239 279 283 298 320-321 
332 336-338 356 359 365 404-406 






thalamus 


Clontech 


THAG02 


2 20-21 23 74 81 85 105-106 116 121 131 
146 171 185 188 200 209 219 233 239 256 
258-259 273 276 362 399 










thymus 


Clontech 


THM001 


16 29 33 57 80 82 85 90 93-94 106-107 120 
126 128 134 141 161 176 194 223 228 235 
253-254 261 274-275 278 285 298 319 332 
336 343 353 359 425 












thymus 


Clontech 


THMC02 


1-2 7-9 14 26 34 44 73 75 82 85 87 94 98 
106-107 109-111 117 119-120 125-126 128- 
129 139 141 144 147-148 151 154-155 162 
165 170-172 175-176 179 182 186 193-194 
199-200 208-209 213 218 233 235 240 242 
247 253-254 257 265 276 281 287 290 305 
307 312 319 336 342 354-356 359 364 367 
399 408 412-413 415 419 421 426 429-433 


thyroid gland 


Clontech 


THR001 


3 5 7-8 28 30-31 33 73-77 80 82 85 88 90- 
92 94 96-98 105-107 109 113 117 121-122 
124-125 127-128 130 134 136 141 143 146- 
148 152 161-163 166 175 177-178 181 194 
199 201 204 210 212 216 218 223-226 228 
230-231 234 236 241 243 246 253-257 261 
270 272-273 276-278 281-283 287 292 295 
298 303-304 308 315 323 329 335 352 359 
362 401 416-417 














trachea 


Clontech 


TRC001 


88 138 180 226 228 279 359 411 436 




uterus 


Clontech 


UTR001 


3 10-11 23 77 92 106-107 109 111 141 197- 
198 218 241 257 270 274-275 302 315 329 
396 400 413 

















*The 16 tissue/mRNAs and their vendor sources are as follows: 1) Normal adult brain 
mRNA (Invitrogen), 2) Normal adult kidney mRNA (Invitrogen), 3) Normal fetal brain mRNA 
(Invitrogen), 4) Normal adult liver mRNA (Invitrogen), 5) Normal fetal kidney mRNA 
(Invitrogen), 6) Normal fetal liver mRNA (Invitrogen), 7) Normal fetal skin mRNA (Invitrogen), 8) 
Human adrenal gland mRNA (Clontech), 9) Human bone marrow mRNA (Clontech), 10) Human 
leukemia lymphoblastic mRNA (Clontech), 11) Human thymus mRNA (Clontech), 12) Human 
lymph node mRNA (Clontech), 13) Human spinal cord mRNA (Clontech), 14) Human thyroid 
mRNA (Clontech), 15) Human esophagus mRNA (BioChain), 16) Human conceptional umbilical 
cord mRNA (BioChain). 
SEQ ID NO: 


Accession No. 


Species 


Description 


Score 


% 

Identity 


1 


gi9837125 


Homo sapiens 


membrane-associated nucleic acid 
binding protein mRNA, partial cds. 


2553 


54 


1 


gi7020305 


Homo sapiens 


cDNA FLJ20301 fis, clone HEP06569. 


1728 


47 


1 


gi7294120 


Drosophila 
melanogaster 


CG16807 gene product 


1535 


53 


2 


AAY57911 


Homo sapiens 


Human transmembrane protein 

XI 1 IVli x^~J»J» 


1258 


82 


2 


AAB88406 


Homo sapiens 


Human membrane or secretory protein 

Hone PSFPOHW 


265 


39 


2 


gil4272664 


Homo sapiens 


unnamed protein product 


265 


39 


3 




Homo sapiens 


Similar to gp25L2 protein, clone 
MGC:2142 IMAGE:2967520, mRNA, 
complete cds. 


1 1 1 a 


100 


3 


gil2845568 


Mus musculus 


putative 


1099 


98 


3 




Homo sapiens 


H.sapiens mRNA for gp25L2 protein. 




AO 


4 


gi9971050 


Homo sapiens 


Human DNA sequence from clone 
RP11-526K24 on chromosome 20. 
Contains a novel gene, the 3' end of a 
novel gene, two CpG islands, ESTs, 
GSSs and STSs, complete sequence. 


4348 


99 


4 


AAB95086 


Homo sapiens 


Human protein sequence SEQ ID 
NO: 16999. 


3034 


99 


4 


gil0433753 


Homo sapiens 


cDNA FLJ12307 fis, clone 
MAMMA1001908. 


3034 


99 


5 


gi4689106 


Homo sapiens 


NADH-ubiquinone oxidoreductase B8 
subunit 


505 


100 


5 


gi2909862 


Homo sapiens 


NADH-ubiquinone oxidoreductase 
subunit CI-B8 mRNA, complete cds. 


505 


100 


5 


gil2539408 


Homo sapiens 


NDUFA2 gene for NADH 
dehydrogenase (ubiquinone) 1 alpha 
subcomplex 2, complete cds. 


505 


100 


6 


AAG64416 


Homo sapiens 


Human nucleoprotein. 


3765 


100 


6 



gil0443046 


Homo sapiens 


Human DNA sequence from clone 
RP11-465L10 on chromosome 20. 
Contains 10 CpG islands, ESTs, STSs 
and GSSs. Contains the gene for a 
novel protein similar to Drosophila 
CG11399, the gene for a novel C2H2 
type zinc finger protein similar to 
chicken FZF-1, a Ferritin light 
polypeptide (FTL) pseudogene, the 
MMP9 gene for matrix 
metalloproteinase 9 (gelatinase B, 
92kD gelatinase, 92kD type IV 
collagenase) (CLG4B), a novel gene, 
the SLC12A5 gene for solute carrier 
family 12, (potassium-chloride 
transporter) member 5 (KIAA1176) 
and the 3' end of gene KIAA1637, 
complete sequence. 


3765 


100 


6 


gil5426514 


Homo sapiens 


clone MGC:16205 IMAGE:3640928, 
mRNA, complete cds. 


3765 


100 


7 


AAG64416 


Homo sapiens 


Human nucleoprotein. 


3366 


100 


7 


gil0443046 


Homo sapiens 


Human DNA sequence from clone 
RP11-465L10 on chromosome 20. 
Contains 10 CpG islands, ESTs, STSs 
and GSSs. Contains the gene for a 
novel protein similar to Drosophila 
CG11399, the gene for a novel C2H2 
type zinc finger protein similar to 
chicken FZF-1, a Ferritin light 
polypeptide (FTL) pseudogene, the 
MMP9 gene for matrix 
metalloproteinase 9 (gelatinase B, 
92kD gelatinase, 92kD type IV 
collagenase) (CLG4B), a novel gene, 
the SLC12A5 gene for solute carrier 
family 12, (potassium-chloride 
transporter) member 5 (KIAA1176) 
and the 3' end of gene KIAA1637, 
complete sequence. 


3366 


100 


7 


gil5426514 


Homo sapiens 


clone MGC:16205 IMAGE:3640928, 
mRNA, complete cds. 


3366 


100 


8 


gil4571904 


Rattus 
norvegicus 


lysosomal amino acid transporter 1 


2145 


85 


8 


AAE04910 


Homo sapiens 


Human transporter and ion channel-23 
(TRICH-23) protein. 


1239 


56 


8 


gi7297404 


Drosophila 
melanogaster 


CG13384 gene product 


837 


43 


9 


AAB73686 


Homo sapiens 


Human oxidoreductase protein ORP- 
19. 


1301 


98 


9 


gi7291405 


Drosophila 
melanogaster 


T3dh gene product 


808 


59 


9 


gi5824752 


Caenorhabditis 
elegans 


predicted using Genefinder~contains 
similarity to Pfam domain: PF00465 
(Iron-containing alcohol 
dehydrogenases), Score=177.7, E- 
value=1.9e-50, N=2~cDNA EST 
EMBL:Z14517 comes from this gene; 
cDNA EST yk18d4.3 comes from this 
gene~cDNA EST yk18d4.5 comes 
from this gene; cDNA EST yk116f5.5 
comes from this gene~cDNA EST 
yk132h3.3 comes from this gene; 
cDNA EST yk73d10.3 comes from this 
gene~cDNA EST yk93e9.3 comes from 
this gene; cDNA EST yk132h3.5 
comes from this gene~cDNA EST 
yk73d10.5 comes from this gene; 
cDNA EST yk93e9.5 comes from this 
gene~cDNA EST yk135b6.5 comes 
from this gene; cDNA EST yk135b6.3 
comes from this gene~cDNA EST 
yk201e5.3 comes from this gene; 
cDNA EST yk268b1.3 comes from this 
gene~cDNA EST yk261d6.3 comes 


685 


52 
from this gene; cDNA EST yk262h11.3 
comes from this gene~cDNA EST 
yk292h11.3 comes from this gene; 
cDNA EST yk304d8.3 comes from this 
gene~cDNA EST yk344b7.3 comes 
from this gene; cDNA EST yk351a6.3 
comes from this gene~cDNA EST 
yk366d9.3 comes from this gene; 
cDNA EST yk368e3.3 comes from this 
gene~cDNA EST yk372c11.3 comes 
from this gene; cDNA EST yk389g3.3 
comes from this gene~cDNA EST 
yk422d2.3 comes from this gene; 
cDNA EST yk381d7.3 comes from this 
gene~cDNA EST yk201e5.5 comes 
from this gene; cDNA EST yk267f6.5 
comes from this gene~cDNA EST 
yk268b1.5 comes from this gene; 
cDNA EST yk261d6.5 comes from this 
gene~cDNA EST yk262h11.5 comes 
from this gene; cDNA EST yk292h11.5 
comes from this gene~cDNA EST 
yk304d8.5 comes from this gene; 
cDNA EST yk344b7.5 comes from this 
gene~cDNA EST yk368e3.5 comes 
from this gene; cDNA EST yk372c11.5 
comes from this gene~cDNA EST 
yk351a6.5 comes from this gene; 
cDNA EST yk366d9.5 comes from this 
gene~cDNA EST yk389g3.5 comes 
from this gene; cDNA EST yk422d2.5 
comes from this gene~cDNA EST 
yk560f4.3 comes from this gene; 
cDNA EST yk625h5.3 comes from this 
gene~cDNA EST yk381d7.5 comes 
from this gene; cDNA EST yk560f4.5 
comes from this gene~cDNA EST 
yk625h5.5 comes from this gene 






10 


AAB73686 


Homo sapiens 


Human oxidoreductase protein ORP- 
19. 


1552 


99 


10 


gi7291405 


Drosophila 
melanogaster 


T3dh gene product 


891 


56 


10 


gi5824752 


Caenorhabditis 
elegans 


predicted using Genefinder~contains 
similarity to Pfam domain: PF00465 
(Iron-containing alcohol 
dehydrogenases), Score=177.7, E- 
value=1.9e-50, N=2~cDNA EST 
EMBL:Z14517 comes from this gene; 
cDNA EST yk18d4.3 comes from this 
gene~cDNA EST yk18d4.5 comes 
from this gene; cDNA EST yk116f5.5 
comes from this gene~cDNA EST 
yk132h3.3 comes from this gene; 
cDNA EST yk73d10.3 comes from this 


730 


51 
gene~cDNA EST yk93e9.3 comes from 
this gene; cDNA EST yk132h3.5 
comes from this gene~cDNA EST 
yk73d10.5 comes from this gene; 
cDNA EST yk93e9.5 comes from this 
gene~cDNA EST yk135b6.5 comes 
from this gene; cDNA EST yk135b6.3 
comes from this gene~cDNA EST 
yk201e5.3 comes from this gene; 
cDNA EST yk268b1.3 comes from this 
gene~cDNA EST yk261d6.3 comes 
from this gene; cDNA EST yk262h11.3 
comes from this gene~cDNA EST 
yk292h11.3 comes from this gene; 
cDNA EST yk304d8.3 comes from this 
gene~cDNA EST yk344b7.3 comes 
from this gene; cDNA EST yk351a6.3 
comes from this gene~cDNA EST 
yk366d9.3 comes from this gene; 
cDNA EST yk368e3.3 comes from this 
gene~cDNA EST yk372c11.3 comes 
from this gene; cDNA EST yk389g3.3 
comes from this gene~cDNA EST 
yk422d2.3 comes from this gene; 
cDNA EST yk381d7.3 comes from this 
gene~cDNA EST yk201e5.5 comes 
from this gene; cDNA EST yk267f6.5 
comes from this gene~cDNA EST 
yk268b1.5 comes from this gene; 
cDNA EST yk261d6.5 comes from this 
gene~cDNA EST yk262h11.5 comes 
from this gene; cDNA EST yk292h11.5 
comes from this gene~cDNA EST 
yk304d8.5 comes from this gene; 
cDNA EST yk344b7.5 comes from this 
gene~cDNA EST yk368e3.5 comes 
from this gene; cDNA EST yk372c11.5 
comes from this gene~cDNA EST 
yk351a6.5 comes from this gene; 
cDNA EST yk366d9.5 comes from this 
gene~cDNA EST yk389g3.5 comes 
from this gene; cDNA EST yk422d2.5 
comes from this gene~cDNA EST 
yk560f4.3 comes from this gene; 
cDNA EST yk625h5.3 comes from this 
gene~cDNA EST yk381d7.5 comes 
from this gene; cDNA EST yk560f4.5 
comes from this gene~cDNA EST 
yk625h5.5 comes from this gene 






11 


AAB85166 


Homo sapiens 


Human Bcl-Gl polypeptide. 


1598 


87 


11 


gil4598300 


Homo sapiens 


unnamed protein product 


1598 


87 


11 


gi12584085 


Homo sapiens 


apoptosis regulator BCL-G long form 
(BCLG) mRNA, complete cds. 


1598 


87 


12 


gil5077865 


Mus musculus 


bullous pemphigoid antigen 1-b 


1253 


82 






WO 02/081731 



PCT/US02/01222 



Table 2 



SEQ ID NO: 


Accession No. 


Species 


Description 


Score 


% 

Identity 


12 


gil5077863 


Mus musculus 


bullous pemphigoid antigen 1-a 


1253 


82 


12 


gi6624582 


Homo sapiens 


Human DNA sequence from clone 
RP1-61B2 on chromosome 6p11.2-12.3 
Contains isoforms 1 and 3 of BPAG1 
(bullous pemphigoid antigen 1 
(230/240kD), an exon of a gene similar 
to murine MACF cytoskeletal protein, 
STSs and GSSs, complete sequence. 


nil 

733 


99 


13 


gi3702270 


Homo sapiens 


chromosome 19, cosmid KJ14Uo, 
complete sequence. 


oo/ 


93 


13 


gi401845 


Homo sapiens 


ribosomal protein L18a mRNA, 
complete cds. 


887 


93 


13 


gil3960144 


Homo sapiens 


ribosomal protein L18a, clone 
MoC:44/o lMAUiirzyoijiy, mKJNA, 
complete cds. 


887 


93 






Homo sapiens 


Breast and ovarian cancer associated 
anugen proicm sequence ony u-j /^o. 




OA 

oU 


I** 




— : 

Homo sapiens 


Human cancer associated protein 




o 1 


14 


gil4198321 


Mus musculus 


ribosomal protein L31 


453 


81 






Homo sapiens 


mpMA f nr If T A A 1 r\mtf*m -nnrrinl 
LLUVLN/\. iUl ■^J_/VfVlUD'T piUlClIl, palUal 

cds. 


J OH J 


ion 


15 


pi4SR4^6A 

gltOOtJVJO 


Homo sapiens 


mRNA; cDNA DKFZp586L1220 
(from clone DKFZp586L1220); partial 
cds. 


1628 


100 


15 


gil3161145 


Homo sapiens 


zinc finger protein mRNA, complete 
cds. 


369 


36 


16 




Mus musculus 




2494 


94 


16 


gi5870834 


Mus musculus 


skm-BOP2 


2397 


91 


16 


oil 800^99 


Mus musculus 


I'Dur 


29R<i 




17 


gil3938126 


Mus musculus 


RIKEN cDNA 3732409C05 gene 


2678 


98 


17 


gil2852375 


Mus musculus 


putative 


2678 


98 


17 


gl/U244JJ 


Torpedo 
marmorata 


male sterility protein 2-like protein 




on 
oU 


18 


AAB95482 


Homo sapiens 


Human protein sequence SEQ ID 

JNLJUoUU/. 


1572 


67 


18 


gil4042809 


Homo sapiens 


cDNA FLJ14932 fis, clone 

TjT a rri Ann/con 
JrLAGc 1 UU9o39 . 


1572 


67 


18 


gil2053165 


Homo sapiens 


mRNA; cDNA DKFZp434K0427 
complete cds. 


1572 


67 


19 


gi7243159 


Homo sapiens 


mRNA for KIAA1389 protein, partial 
cds. 


7842 


99 


19 


gi4151328 


Homo sapiens 


high-risk human papilloma viruses E6 
oncoproteins targeted protein E6TP1 
alpha mRNA, complete cds. 


3777 


53 


19 


gi4151330 


Homo sapiens 


high-risk human papilloma viruses E6 
oncoproteins targeted protein E6TP1 
beta mRNA, complete cds. 


3768 


53 


20 


gi7243159 


Homo sapiens 


mRNA for KIAA1389 protein, partial 
cds. 


7714 


98 


20 


gi4151328 


Homo sapiens 


high-risk human papilloma viruses E6 
oncoproteins targeted protein E6TP1 


3806 


54 








alpha mRNA, complete cds. 






20 


gi4151330 


Homo sapiens 


high-risk human papilloma viruses E6 
oncoproteins targeted protein E6TP1 
beta mRNA, complete cds. 


p fyf 




21 


AAB95328 


Homo sapiens 


Human protein sequence SEQ ID 
CsU.l fjyj. 


753 


61 


21 


AAB93757 


Homo sapiens 


Human protein sequence SEQ ID 


753 


61 


21 




— : 

Homo sapiens 


Human membrane-associated protein 
nUJVlAr-H. 




Ol 


22 


gi7673373 


Homo sapiens 


SCAN-related protein RAZ1 (RAZ1) 
mRNA, partial cds. 


1104 


100 


22 


AAG93274 


Homo sapiens 


Human protein HP10543. 


900 


100 


22 


AAB42846 


Homo sapiens 


Human ORFX ORF2610 polypeptide 
sequence SEQ ID NO:5220. 


900 


100 


23 


gi7242963 


Homo sapiens 


mRNA for KIAA1304 protein, partial 
cds. 


5409 


99 


23 


gi3413874 


Homo sapiens 


mRNA for KIAA0456 protein, partial 
cds. 


3695 


67 


23 


AAB30852 


Homo sapiens 


Amino acid sequence of human signal 
transduction protein SGT6-1, 


3685 


68 


24 


AAG64386 


Homo sapiens 


Human alcohol dehydrogenase 39. 


1228 


77 


24 


gil2861800 


Mus musculus 


putative 


1083 


66 


24 


gi3878713 


Caenorhabditis 
elegans 


weak similarity with quinone 
oxidoreductase, contains similarity to 
Pfam domain: PF00107 (Zinc-binding 
dehydrogenases), Score—80.6, E- 

i rnln a & 1o A/C XT — 1 /»TYKTA T3CT 

value— o.ze-uo, rs— i~cjdin a Ho i 
ykl64b4.5 comes from this 
gene^cL/iN/v no 1 yKio^D^f.o comes 
from this gene-cDNA EST yk264B.5 

r/kniM frnttt tliic amp 


556 


39 


25 


AAE02629 


Homo sapiens 


Human secreted protein 2alpha37. 


2481 


100 


25 


gil4536691 


Homo sapiens 


unnamed protein product 


2481 


100 


25 


AAY99419 


Homo sapiens 


Human PRO1780 (UNQ842) amino 
aciu sequence ony ali i>kj:ZoZ. 


1960 


77 


26 


gi6102869 


Homo sapiens 


mRNA; cDNA DKFZp434H1235 
(from clone DKFZp434H1235); partial 
cds. 


831 


100 




gllZo3J*l37 


Mus musculus 


putative 


lory 




26 


gi2198807 


Gallus gallus 


monocarboxylate transporter 3 


505 


29 


27 


gi7299069 


Drosophila 
melanogaster 


CG11755 gene product 


205 


34 


27 


gB875367 


Caenorhabditis 
elegans 


contains 3 cysteine rich repeats 


136 


41 


27 


gi3249080 


Arabidopsis 
thaliana 


Contains similarity to MYB 
transcription factor isolog T01O24.1 
gb|2288980 from A. thaliana BAC 
gb|AC002335. 


69 


35 


28 


gil 1041628 


Homo sapiens 


RPL6 gene for ribosomal protein L6, 
complete cds. 


1207 


98 


28 


gi433416 


Homo sapiens 


Human mRNA for DNA-binding 
protein, TAXREB107, complete cds. 


1207 


98 


28 


gil3278717 


Homo sapiens 


ribosomal protein L6, clone 
MGC:1635 IMAGE:2823733, mRNA, 
complete cds. 


1207 


98 




AAUUJolU 


Homo sapiens 


Human secreted protein, SEQ ID NO: 
7891. 


has 


1Aft 
IvO 


29 


gil86800 


Homo sapiens 


Human ribosomal protein L12 mRNA, 
complete cds. 


845 


100 


29 


gl 1415/03 J J 


Homo sapiens 


ribosomal protein L12, clone 
MGC:9760 IMAGE:3855674, mRNA, 
complete cds. 


©43 


1 t\f\ 


30 


AAB95051 


Homo sapiens 


Human protein sequence SEQ ID 
XNU:ioo4y. 


2965 


100 


30 


gil0433519 


Homo sapiens 


cDNA FLJ12118 fis, clone 
MAMMA1000085, weakly similar to 
PUTATIVE CYSTEINYL-TRNA 
SYNTHETASE C29E6.06C (EC 
6.1.1.16). 


2965 


100 


30 


gil3938199 


Homo sapiens 


hypothetical protein FLJ12118, clone 

JVHjV^.13U44 UVIAvjCZoZZD J /, mKJNA, 

complete cds. 


2959 


99 


31 




Mus musculus 


putative 


QAM 
Z*rf 1 




31 


gi7959195 


Homo sapiens 


mRNA for KIAA1467 protein, partial 
cds. 


2232 


100 


31 


gil3278148 


Mus musculus 


Similar to RIKEN cDNA 8430419L09 
gene 


794 


83 


32 


gi15530305 


Homo sapiens 


Similar to RIKEN cDNA 1700045119 

gcuv) nunc Lvx\j\j.z, yt / 

IMAGE:3509621, mRNA, complete 
cds. 


1245 


84 


32 


ri9858803 


Mus musculus 


Zfb228 


512 


47 


32 


AAG75629 


Homo sapiens 


Human colon cancer antigen protein 
SEQ ID NO:6393. 


511 


46 


33 


gi810l071 


Homo sapiens 


golgin-like protein (GLP) gene, 


312 


46 








gUJgUU-U&C piULCLU ^Ul/f ^ HJX\J.N/\, 


^19 
jii 


HO 


33 


gil 1037008 


Human 
herpesvirus 8 


latent nuclear antigen 


245 


40 


34 


gi437985 


Canis 
familiaris 


Rab 12 protein 1 


1071 


99 


34 


gi206531 


Rattus 
norvegicus 


RAB12 


995 


96 


34 


Kil2851149 


Mus musculus 


putative 


819 


96 


35 


gil3543689 


Homo sapiens 


Similar to RIKEN cDNA 4933405K01 
gene, clone MGC:14799 
IMAGE:4068454, mRNA, complete 
cds. 


1077 


96 


35 


gil2805373 


Mus musculus 


Unknown (protein for MGC:7298) 


950 


84 


35 


gil2855529 


Mus musculus 


putative 


642 


79 


36 


gil2697979 


Homo sapiens 


mRNA for KIAA1717 protein, partial 
cds. 


1982 


100 


36 


gil651678 


Synechocystis 
sp. PCC6803 


ORF_ID:slr1485~hypothetical protein 


185 


34 
36 


gi2739367 


Arabidopsis 
thaliana 


putative phosphatidylinositol-4- 
phosphate 5-kinase 


153 


28 


37 


gi3800892 


Homo sapiens 


neurexin III-alpha gene, partial cds. 


1255 


99 


37 


gi294602 


Rattus 
norvegicus 


neurexin III-alpha 


1160 


91 


37 


gi205716 


Rattus 
norvegicus 


neurexin D-alpha-a 


561 


50 


38 


gil0047315 


Homo sapiens 


mRNA for KIAA1619 protein, partial 
cds. 


4447 


99 


38 


gi8217424 


Homo sapiens 


Human DNA sequence from clone 
RP11-108L7 on chromosome 10, 
contains part of the gene for a novel 
Insulin-like growth factor binding type 
protein with Kazal-type serine protease 
inhibitor domain, the gene for a novel 
protein similar to rat tricarboxylate 
carrier, the gene for a novel PDZ 
(DHR, GLGF) domain protein, the 
gene for a novel protein similar to 
KIAA0552, KIAA0341 and Fugu 
hypothetical protein 2, the gene for a 
novel protein similar to Plasmodium 
POM1 and C. elegans F46G11.1, a 
putative novel gene, the SEMA4G gene 
for semaphorin 4G and a novel gene. 
Contains ESTs, STSs, GSSs and seven 
putative CpG islands, complete 
sequence. 


4407 


99 


38 


gi4836757 


Mus musculus 


semaphorin subclass 4 member G 


4021 


90 




glIU*00004 


Homo sapiens 


cDNA: FLJ22324 fis, clone 
HRC05551. 


307 


100 


39 


gil3559240 


Homo sapiens 


Human DNA sequence from clone 
RP5-842G6 on chromosome 20. 
Contains the 3' end of a novel gene, the 
3' end of the gene for a novel protein 
similar to SEL1L (sel-1 (suppressor of 
lin-12, C.elegans)-like), ESTs, STSs 
and GSSs, complete sequence. 


307 


100 


39 


gil3543669 


Homo sapiens 


hypothetical protein FLJ22324, clone 
MGC:14701 IMAGE:4247211, mRNA, 
complete cds. 


307 


100 


40 


gil4595019 


Homo sapiens 


mRNA for keratin 6 irs (KRT6IRS 
gene). 


2615 


99 


40 


gi6092075 


Mus musculus 


type II cytokeratin 


2414 


91 


40 


gil5559584 


Homo sapiens 


Similar to keratin 6A, clone 
MGC:20671 IMAGE:3639270, mRNA, 
complete cds. 


1468 


57 


41 


gil2655452 


Homo sapiens 


mRNA for keratin associated protein 
4.7 (KRTAP4.7 gene). 


1157 


86 


41 


gil2655464 


Homo sapiens 


partial mRNA for keratin associated 
protein 4.15 (KRTAP4.15 gene). 


1090 


88 


41 


gil2655462 


Homo sapiens 


mRNA for keratin associated protein 
4.14 (KRTAP4.14 gene). 


1063 


84 
42 


gi553772 


Homo sapiens 


Human Tcr-C-delta gene, exons 1-4; 
Tcr-V-delta gene, exons 1-2; T-cell 
receptor alpha (Tcr-alpha) gene, J1-J61 
segments; and Tcr-C-alpha gene, exons 
1-4. 


110 


100 


42 


gi4379087 


Homo sapiens 


mRNA for TCR alpha variable region, 
patient AF31. 


73 


46 


42 


AAW40057 


Homo sapiens 


Cellular transcriptional factor p300. 


71 


42 


43 


gil5866589 


Capsella 
rubella 


hypothetical protein 


97 


30 


43 


gi3879045 


Caenorhabditis 
elegans 


R102.6 


96 


34 


43 


AAY56133 


Homo sapiens 


Human N-methyl-D-aspartate receptor 
2"subunit SEQ ID NO:54. 


94 


52 


44 


gil3569345 


Homo sapiens 


pregnancy-associated plasma 
preproprotein-A2 mRNA, complete 
cds. 


9839 


99 


44 


gil0639043 


Homo sapiens 


mRNA for pregnancy-associated 
plasma protein-E (PAPPE gene). 


8966 


99 


44 


gil 142970 


Homo sapiens 


Human pregnancy-associated plasma 
protein-A preproform (PAPPA) 
mRNA, complete cds. 


3856 


45 


45 


gil2851017 


Mus musculus 


putative 


578 


83 


45 


gi4490653 


Schizosacchar 
omyces pombe 


profilin. 


186 


35 


45 


gi440266 


Acanthamoeba 
castellanii 


profilin I 


166 


34 


46 


gil617480 


Comamonas 
testosteroni 


unknown 


712 


82 


46 


gi3046394 


Ralstonia 
eutropha 


phbF 


563 


66 


46 


gi6683782 


Burkholderia 
sp. DSMZ 
9242 


unknown 


560 


61 


47 


gi9229934 


Mus musculus 


midnolin 


2103 


78 


47 


AAB56832 


Homo sapiens 


Human prostate cancer antigen protein 
sequence SEQ ID NO:1410. 


912 


71 


47 


gi15929300


Homo sapiens 


Similar to midnolin, clone 
IMAGE:3958934, mRNA, partial cds.


907 


100 


48 


gi13377624


Homo sapiens 


calicin mRNA, complete cds.


3089 


99 


48 


gi854100 


Homo sapiens 


H.sapiens mRNA for calicin (partial). 


3076 


99 


48 


gi853784 


Bos taurus 


calicin 


2896 


91 


49 


AAB68411 


Homo sapiens 


Amino acid sequence of a human 
NOV2 polypeptide. 


2131 


100 


49 


AAY99407 


Homo sapiens 


Human PRO1337 (UNQ692) amino
acid sequence SEQ ID NO:236. 


2101 


99 


49 


AAB68414 


Homo sapiens 


Amino acid sequence of NOV2 
polypeptide clone TA-cgALl 32708 A. 


2014 


99 


50 


gi12082748


Mus musculus 


T-box transcription factor TBX18 


2972 


93 


50 


gi5102617 


Homo sapiens 


Human DNA sequence from clone
33L1 on chromosome 6q14.1-15.
Contains the gene for novel T-box
(Brachyury) family protein. Contains
ESTs, STSs, GSSs and two putative
CpG islands, complete sequence.


2634


100


50 


gi12849661


Mus musculus 


putative 


2223 


96 


51 


gi12843048


Mus musculus 


putative 


339 


72 


51 


gi6691626 


Homo sapiens 


RAGE mRNA for advanced glycation 
endproducts receptor, complete cds. 


111 


32 


51 


gi190846


Homo sapiens 


Human receptor for advanced 
glycosylation end products (RAGE) 
mRNA, partial cds. 


111 


32 


52 


AAG71840 


Homo sapiens 


Human olfactory receptor polypeptide, 
SEQ ID NO:1521.


1313 


85 


52 


AAG71839 


Homo sapiens 


Human olfactory receptor polypeptide, 
SEQ ID NO:1520.


1226 


81 


52 


AAG71837 


Homo sapiens 


Human olfactory receptor polypeptide, 
SEQ ID NO:1518.


1159 


77 


53 


AAB94026 


Homo sapiens 


Human protein sequence SEQ ID 
NO:14163. 


966 


98 


53 


gi10433955


Homo sapiens 


cDNA FLJ12457 fis, clone
NT2RM1000666, weakly similar to 
DNA-BINDING PROTEIN A. 


966 


98 


53 


gi7295442 


Drosophila 
melanogaster 


CG17334 gene product 


302 


47 


54 


gi8980396 


Homo sapiens 


mRNA for T-cell antigen receptor- 
alpha, clone Pil- 1 a, partial. 


566 


97 


54 


gi2358063 


Homo sapiens 


T-cell receptor alpha delta locus from 
bases 752679 to 1000555 (section 4 of 
5) of the Complete Nucleotide 
Sequence. 


565 


100 


54 


gi623149 


Macaca 
mulatta 


T-cell receptor alpha 


512 


85 


55 


gi2792496 


Rattus 
norvegicus 


tulip 2 


2437 


86 


55 


gi4884288 


Homo sapiens 


mRNA; cDNA DKFZp566D133 (from
clone DKFZp566D133); partial cds. 


1983 


99 


55 


AAB41763 


Homo sapiens 


Human ORFX ORF1527 polypeptide 
sequence SEQ ID NO:3054. 


1976 


98 


56 


gi15524592


Homo sapiens 


unnamed protein product 


1033 


52 


56 


gi537514 


Homo sapiens 


Human arylacetamide deacetylase 
mRNA, complete cds. 


1033 


52 


56 


AAB54079 


Homo sapiens 


Human pancreatic cancer antigen 
protein sequence SEQ ID NO:531.


1017 


51 


57 


AAB33831 


Homo sapiens 


Human secreted protein BLAST search 
protein SEQ ID NO: 175. 


149 


35 


57 


gi1109682


Bos taurus 


G-protein gamma-12 subunit


149 


35 


57 


AAW09416 


Homo sapiens 


Human G protein gamma-7 subunit. 


144 


33 


58 


gi12082750


Mus musculus 


T-box transcription factor TBX20 


1469 


93 


58 


gi9909810 


Mus musculus 


T-box transcription factor 


1469 


93 


58 


gi7229717 


Danio rerio 


H15-related T-box transcription factor 
hrT 


1346 


85 


59 


gi4185946 


Human 
endogenous 
retrovirus K 


gag protein 


146 


26


59 


gi5802821


Homo sapiens 


endogenous retrovirus HERV-K108,
complete sequence.


146


26


59 


gi5802814 


Homo sapiens 


endogenous retrovirus HERV-K103, 
complete sequence. 


146 


26 


60 


AAB94756 


Homo sapiens 


Human protein sequence SEQ ID 
NO:15815. 


126 


42 


60 


gi332612 


Gibbon ape 
leukemia virus 


pol polyprotein 


113 


50 


60 


gi3133302


Sus scrofa 


pol protein 


110 


53 


61 


gi10121625


Gillichthys
mirabilis 


60S acidic ribosomal protein P1


127 


81 


61 


AAB44012 


Homo sapiens 


Human cancer associated protein 
sequence SEQ ID NO: 1457. 


125 


78 


61 


AAB43434 


Homo sapiens 


Human cancer associated protein 
sequence SEQ ID NO:879. 


125 


78 


62 


AAB12585 


Homo sapiens 


Human T cell activating protein SEQ 
ID NO:4.


140 


37 


62 


gi12805221


Mus musculus


lymphocyte antigen 6 complex 


140 


37 


62 


gi198924


Mus musculus


Ly-6A.2 


140 


37 


63 


gi6969165 


Homo sapiens 


Human DNA sequence from clone 
RP3-475N16 on chromosome 6p12.3-
21.2. Contains the genes for CTG4A,
pre-T cell receptor alpha, a novel
protein similar to RPL7A (60S
ribosomal protein L7A) and the 3' end
of gene KIAA0240. Contains ESTs, 
STSs, GSSs and four putative CpG 
islands, complete sequence. 


573 


67 


63 


gi12841727


Mus musculus 


putative 


512 


59 


63 


gi15293877


Ictalurus
punctatus 


ribosomal protein L7 


314 


38 


64 


gi181573


Homo sapiens 


Human cytokeratin 8 (CK8) gene, 
complete cds. 


1147 


79 


64 


gi181400


Homo sapiens 


Human cytokeratin 8 mRNA, complete 
cds. 


1147 


78 


64 


gi400416 


Homo sapiens 


H.sapiens KRT8 mRNA for keratin 8. 


1147 


79 


65 


gi13620887


Mus musculus 


mitochondrial ribosomal protein S6 


633 


100 


65 


gi13620885


Homo sapiens 


MRPS6 mRNA for mitochondrial 
ribosomal protein S6, partial cds. 


565 


85 


65 


gi14603226


Homo sapiens 


clone MGC:19576 IMAGE:4304420,
mRNA, complete cds. 


565 


85 


66 


gi13537119


Homo sapiens 


mRNA for PAR-6 gamma, complete 
cds. 


1956 


100 


66 


gi8037909 


Mus musculus 


PAR6A 


1490 


76 


66 


gi9453884 


Homo sapiens 


mRNA for 16-5-5, partial cds.


1304 


93 


67 


AAB95293 


Homo sapiens 


Human protein sequence SEQ ID 
NO:17517. 


776 


79 


67 


AAG81270 


Homo sapiens 


Human AFP protein sequence SEQ ID 
NO:58. 


776 


79 


67 


gi14035848


Homo sapiens 


unnamed protein product 


776 


79 


68 


gi7020759 


Homo sapiens 


cDNA FLJ20565 fis, clone REC00542.


930 


60 


68 


gi15216181


Homo sapiens 


mRNA for putative 67-11-3 protein.


927


60 


68 


gi15930069


Homo sapiens 


Similar to hypothetical protein
FLJ20565, clone MGC:8850
IMAGE:3914396, mRNA, complete
cds.


917


60


69 


gi3228237 


Homo sapiens 


UHS KerB gene. 


810 


72 


69 


gi200962 


Mus musculus 


serine 1 ultra high sulfur protein 


755 


69 


69 


gi32472 


Homo sapiens 


H.sapiens mRNA for high-sulphur 
keratin. 


749 


71 


70 


AAB92789 


Homo sapiens 


Human protein sequence SEQ ID 
NO: 11284. 


3518 


100 


70 


gi7022420 


Homo sapiens 


cDNA FLJ10407 fis, clone
NT2RM4000520. 


3518 


100 


70 


gi13111786


Homo sapiens 


hypothetical protein FLJ10407, clone
MGC:970 IMAGE:3509727, mRNA, 
complete cds. 


3511 


99 


71 


gi13325178


Homo sapiens 


Similar to RIKEN cDNA 2210016F16 
gene, clone MGC:10999
IMAGE:3638524, mRNA, complete 
cds. 


856 


100 


71 


gi7291278 


Drosophila 
melanogaster 


CG9752 gene product 


744 


43 


71 


gi2854153 


Caenorhabditis 
elegans 


Hypothetical protein C11D2.4


729 


45 


72 


gi7020991 


Homo sapiens 


cDNA FLJ20718 fis, clone HEP17872.


3013 


100 


72 


gi15680144


Homo sapiens 


hypothetical protein FLJ20718, clone
IMAGE:4577269, mRNA, partial cds. 


2906 


99 


72 


gi10801646


Macaca 
fascicularis 


hypothetical protein 


1097 


99 


73 


AAG93290 


Homo sapiens 


Human protein HP10650.


1215 


100 


73 


gi14587195


Homo sapiens 


FAPP1-associated protein 1 (FASP1)
mRNA, complete cds. 


1215 


100 


73 


gi8118225


Homo sapiens 


chromosome 21 unknown mRNA. 


1215 


100 


74 


gi10436998


Homo sapiens 


cDNA: FLJ21011 fis, clone 
CAE04289. 


2522 


100 


74 


gi15030282


Homo sapiens 


clone MGC:16827 IMAGE:3855873,
mRNA, complete cds. 


2522 


100 


74 


gi8570641 


Homo sapiens 


clone 133K02 unknown mRNA. 


2514 


99 


75 


gi6599255 


Homo sapiens 


mRNA; cDNA DKFZp434C0328 
(from clone DKFZp434C0328). 


1612 


100 


75 


gi6330416 


Homo sapiens 


mRNA for KIAA1201 protein, partial 
cds. 


554 


38 


75 


AAB74726 


Homo sapiens 


Human membrane associated protein 
MEMAP-32. 


496 


35 


76 


gi7021059 


Homo sapiens 


cDNA FLJ20758 fis, clone HEP01508. 


1450 


100 


76 


AAW88552 


Homo sapiens 


Secreted protein encoded by gene 19 
clone HSAVU34. 


1429 


100 


76 


gi15341707


Homo sapiens 


clone MGC:19979 IMAGE:3939273, 
mRNA, complete cds. 


1429 


100 


77 


AAB95410 


Homo sapiens 


Human protein sequence SEQ ID 
NO:17796.


774 


100 


77 


gi10435394


Homo sapiens 


cDNA FLJ13391 fis, clone
PLACE1001241. 


774 


100 


77 


gi10503974


Homo sapiens 


clone SP24 unknown mRNA. 


765 


99


78 


gi7020587 


Homo sapiens 


cDNA FLJ20467 fis, clone KAT06638.


737 


100 
78


AAB42883 


Homo sapiens 


Human ORFX ORF2647 polypeptide 
sequence SEQ ID NO:5294. 


530 


100 


78 


AAB56642 


Homo sapiens 


Human prostate cancer antigen protein 
sequence SEQ ID NO:1220. 


530 


100 


79 


AAW93948 


Homo sapiens 


Human regulatory molecule HRM-4 
protein. 


441 


91 


79 


gi12852696


Mus musculus 


putative 


386 


47 


79 


gi12751103


Homo sapiens 


PNAS-129 mRNA, complete cds. 


348 


100 


80 


gi7243053 


Homo sapiens 


mRNA for KIAA1336 protein, partial 
cds. 


3851 


99 


80 


gi7292144 


Drosophila 
melanogaster 


CG2069 gene product 


1634 


44 


80 


gi1065457


Caenorhabditis 
elegans 


C54G7.4 gene product 


706 


25 


81 


gi10439581


Homo sapiens 


cDNA: FLJ23023 fis, clone
LNG01678. 


652 


100 


81 


gi7021132 


Homo sapiens 


cDNA FLJ20813 fis, clone
ADSE01247. 


652 


100 


81 


AAG74674 


Homo sapiens 


Human colon cancer antigen protein 
SEQ ID NO:5438.


556 


92 


82 


gi5262611 


Homo sapiens 


mRNA; cDNA DKFZp434I114 (from
clone DKFZp434I114); complete cds.


838 


100 


82 


gi11493368


Homo sapiens 


Human DNA sequence from clone 
RP5-1009E24 on chromosome 20 
Contains the SN gene encoding 
sialoadhesin, a novel gene similar to 
KIAA0417, the CENPB gene for 
centromere protein B, the CDC25B 
gene for Cell division cycle protein 
25B, three novel genes, the 5' end of 
gene KIAA1271, nine CpG islands, 
ESTs, STSs and GSSs, complete 
sequence. 


838 


100 


82 


gi13543798


Mus musculus 


RIKEN cDNA 4931426K16 gene


680 


92 


83 


AAB57003 


Homo sapiens 


Human prostate cancer antigen protein 
sequence SEQ ID NO:1581. 


1302 


99 


83 


AAR60558 


Homo sapiens 


Human basigin I.


1302 


99 


83 


gi3492872 


Homo sapiens 


chromosome 19, cosmid F18382 
(LLNLF-140D2) and T overlapping 
restriction fragment, complete 
sequence. 


1302 


99 


84 


gi9187614 


Homo sapiens 


mRNA full length insert cDNA clone 
EUROIMAGE 1759349. 


580 


100 


84 


AAB01394 


Homo sapiens 


Neuron-associated protein. 


70 


39 


84 


AAB54358 


Homo sapiens 


Human pancreatic cancer antigen 
protein sequence SEQ ID NO:810. 


70 


39 


85 


gi15986445


Homo sapiens 


p90 autoantigen mRNA, complete cds. 


4513 


99 


85 


gi7959315 


Homo sapiens 


mRNA for KIAA1524 protein, partial
cds. 


4357 


99 


85 


AAB95207 


Homo sapiens 


Human protein sequence SEQ ID 
NO:17311. 


2341 


100 


86 


gi7959231 


Homo sapiens 


mRNA for KIAA1485 protein, partial 
cds. 


5813 


99 
86


AAB40418 


Homo sapiens 


Human ORFX ORF182 polypeptide 
sequence SEQ ID NO:364. 


708 


99 


86 


gi5901529 


Homo sapiens 


C2H2 type Kruppel-like zinc finger 
protein splice variant b (ZNF236) 
mRNA, complete cds. 


520 


24 


87 


gi7243270 


Homo sapiens 


mRNA for KIAA1436 protein, partial 
cds. 


4604 


99 


87 


gi5051974 


Mus musculus 


F2 alpha prostoglandin regulatory 
protein 


4195 


89 


87 


gi1054884


Rattus 
norvegicus 


prostaglandin F2a receptor regulatory 
protein precursor 


4191 


88 


88 


gi13241286


Mus musculus 


GABA(A) receptor-associated protein- 
like 2 


607 


100 


88 


gi2104570 


Rattus 
norvegicus 


GEF-2 


607 


100 


88 


gi4433387 


Bos taurus 


general protein transport factor p16


607 


100 


89 


gi15859535


Homo sapiens 


unnamed protein product 


5935 


99 


89 


gi3043606 


Homo sapiens 


mRNA for KIAA0541 protein, partial 
cds. 


5890 


100 


89 


gi15624075


Homo sapiens 


TGF-beta resistance-associated protein 
TRAG (TRAG) mRNA, partial cds. 


5719 


96 


90 


gi337370 


Homo sapiens 


Human rapamycin- and FK506-binding
protein, complete cds. 


740 


100 


90 


gi13097252


Homo sapiens 


Similar to FK506 binding protein 2 (13 
kDa), clone MGC:5177
IMAGE:3445148, mRNA, complete 
cds. 


740 


100 


90 


AAQ31004 aa 
1 


Homo sapiens 


hRFKBP cDNA. 


735 


99 


91 


gi12053147


Homo sapiens 


mRNA; cDNA DKFZp434F1726 (from 
clone DKFZp434F1726). 


1450 


100 


91 


gi412195 


Homo sapiens 


unknown 


265 


98 


91 


AAR04931 


Homo sapiens 


Interferon-gamma receptor segment 
from clone 39 responsible for binding
the target 


260 


96 


92 


gi10437948


Homo sapiens 


cDNA: FLJ21783 fis, clone HEP00284.


3276 


100 


92 


AAB95352 


Homo sapiens 


Human protein sequence SEQ ID 
NO:17643. 


1953 


99 


92 


gi10435077


Homo sapiens 


cDNA FLJ13171 fis, clone
NT2RP3003819. 


1953 


99 


93 


gi12803319


Homo sapiens 


clone MGC:3090 IMAGE:3347913, 
mRNA, complete cds. 


4837 


99 


93 


gi14044064


Homo sapiens 


hypothetical protein DKFZp762M115,
clone MGC:14418 IMAGE:4302613, 
mRNA, complete cds. 


4831 


99 


93 


gi10047337


Homo sapiens 


mRNA for KIAA1630 protein, partial 
cds. 


4671 


100 


94 


AAB70535 


Homo sapiens 


Human PROS protein sequence SEQ 
ID NO: 10. 


2979 


100 


94 


gi13185719


Homo sapiens 


unnamed protein product 


2979 


100 


94 


AAB94106 


Homo sapiens 


Human protein sequence SEQ ID 
NO:14334. 


2334 


100 


95 


gi12837873


Mus musculus 


putative 


2370 


75 
95


gi13195574


Mus musculus 


Praja1 isoform a


2339 


75 


95 


AAB93847 


Homo sapiens 


Human protein sequence SEQ ID 
NO:13691. 


1941 


99 


96 


gi2224543 


Homo sapiens 


Human mRNA for KIAA0301 gene, 
partial cds. 


10626 


100 


96 


gi7529572 


Homo sapiens 


Human DNA sequence from clone 
RP1-12208 on chromosome 6q14.2-
16.1. Contains the 3' part of a novel
gene partially coded for by KIAA0301,
a novel gene and the 3' part of the gene
KIAA0957. Contains ESTs, STSs,
GSSs and a putative CpG island,
complete sequence. 


10626 


100 


96 


gi10727627


Drosophila 
melanogaster 


CG13185 gene product 


1452 


34 


97 


AAB82318 


Homo sapiens


Human immunoglobulin receptor
IRTA5 protein. 


2235 


100 


97 




Homo sapiens


Fc receptor-like protein 1 (FCRH1)
mRNA, complete cds. 


2235 


100 


97 




Homo sapiens


Human DNA sequence from clone
RP11-367J7 on chromosome 1. 
Contains (part of) two or more genes 
for novel Immunoglobulin domains 
containing proteins, a SON DNA 
binding protein (SON) pseudogene, a 
voltage-dependent anion channel 1 
(VDAC1) (plasmalemmal porin) 
pseudogene, ESTs, STSs and GSSs, 
complete sequence. 


1533 


100 


98 


AAB82318 


Homo sapiens 


Human immunoglobulin receptor 
IRTA5 protein. 


2177 


98 


98 


gi15528831


Homo sapiens 


Fc receptor-like protein 1 (FCRH1) 
mRNA, complete cds. 


2177 


98 


98 


gi9930921


Homo sapiens 


Human DNA sequence from clone 
RP11-367J7 on chromosome 1.
Contains (part of) two or more genes 
for novel Immunoglobulin domains 
containing proteins, a SON DNA 
binding protein (SON) pseudogene, a 
voltage-dependent anion channel 1 
(VDAC1) (plasmalemmal porin) 
pseudogene, ESTs, STSs and GSSs, 
complete sequence. 


1533 


100 


99 


gi10438861


Homo sapiens 


cDNA: FLJ22461 fis, clone
HRC10107. 


4904 


100 


99 


gi15079400


Homo sapiens 


clone MGC:16796 IMAGE:3855477,
mRNA, complete cds. 


4899 


99 


99 


AAU03497 


Homo sapiens 


Human sterol sensing domain protein. 


4047 


99 


100 


gi6524024 


Mus musculus 


mammalian inositol hexakisphosphate 
kinase 1 


1031 


50 


100 


gi10280996


Rattus 
norvegicus 


inositol hexakisphosphate kinase 


1027 


49 


100 


gi6683115


Homo sapiens 


mRNA for KIAA0263 protein, partial
cds.


1021


49


101 


gi6524024 


Mus musculus


mammalian inositol hexakisphosphate 
kinase 1 


1037 


51 


101 


gi10280996


Rattus 
norvegicus 


inositol hexakisphosphate kinase 


1033 


50 


101 


gi6683115 


Homo sapiens 


mRNA for KIAA0263 protein, partial 
cds. 


1027 


50 


102 


gi13623311


Homo sapiens 


clone IMAGE:3948563, mRNA,
partial cds. 


1629 


100 


102 


gi3135968 


Homo sapiens 


Human DNA sequence from clone 
XXbac-3418 on chromosome 6p21.3- 
22.1. Contains the 5' end of the 
ZNF184 gene for Kruppel-like zinc 
finger protein 184, a heterogeneous
nuclear ribonucleoprotein A1
(HNRPA1) pseudogene, a CD83 
antigen pseudogene, ESTs, STSs, GSSs 
and three CpG islands, complete 
sequence. 


1627 


47 


102 


gi1769491


Homo sapiens 


Human kruppel-related zinc finger 
protein (ZNF184) mRNA, partial cds. 


1625 


47 


103 


gi16198398


Homo sapiens 


clone MGC:27353 IMAGE:4671816, 
mRNA, complete cds. 


2606 


85 


103 


gi829151 


Homo sapiens 


H.sapiens ZNF37A mRNA for zinc 
finger protein. 


1371 


99 


103 


gi9801232 


Homo sapiens 


Human DNA sequence from clone 
RP11-508N22 on chromosome 10
Contains part of a novel gene 
(HSPC025), part of the ZNF37A (zinc 
finger protein 37a (KOX 21)) gene,
part of a putative novel gene, ESTs, 
STSs, GSSs and a CpG Island, 
complete sequence. 


1371 


99 


104 


gi12053123


Homo sapiens 


mRNA; cDNA DKFZp434K1421 
(from clone DKFZp434K1421); 
complete cds. 


2624 


100 


104 


gi7292866 


Drosophila 
melanogaster 


CG15747 gene product 


362 


31 


104 


gi7549210 


Babesia 
bigemina 


200 kDa antigen p200 


298 


21 


105 


gi12053123


Homo sapiens 


mRNA; cDNA DKFZp434K1421 
(from clone DKFZp434K1421); 
complete cds. 


2898 


100 


105 


gi6841130 


Homo sapiens 


HSPC095 mRNA, partial cds. 


419 


100 


105 


gi7292866 


Drosophila 
melanogaster 


CG15747 gene product 


364 


30 


106 


gi10438207


Homo sapiens 


cDNA: FLJ21977 fis, clone HEP05976. 


1978 


99 


106 


gi15012167


Homo sapiens 


hypothetical protein FLJ21977, clone 
MGC:14918 IMAGE:3936410, mRNA, 
complete cds. 


1974 


99 


106 


AAB42499 


Homo sapiens 


Human ORFX ORF2263 polypeptide 
sequence SEQ ID NO:4526. 


1392 


100 


107 


gi1228035


Homo sapiens 


Human mRNA for KIAA0191 gene,
partial cds.


8020


99


107 


gi12697967


Homo sapiens 


mRNA for KIAA1711 protein, partial
cds. 


1593 


58 


107 


AAB94636 


Homo sapiens 


Human protein sequence SEQ ID 
NO: 15515. 


1004 


52 


108 


AAG81252 


Homo sapiens 


Human AFP protein sequence SEQ ID 
NO:22. 


2146 


99 


108 


gi14035812


Homo sapiens 


unnamed protein product 


2146 


99 


108 


gi10440123


Homo sapiens 


cDNA: FLJ23436 fis, clone
HRC12692. 


2054 


100 


109 


gi200009 


Mus musculus 


myosin I 


5386 


96 


109 


gi1666471


Mus musculus 


myosin I heavy chain 


5360 


94 


109 


gi56733 


Rattus 
norvegicus 


myosin I heavy chain 


5268 


91 


110 


gi12053045


Homo sapiens 


mRNA; cDNA DKFZp434K1115
(from clone DKFZp434K1115);
complete cds. 


4840 


100 


110 


AAB65631 


Homo sapiens 


Novel protein kinase, SEQ ID NO:158.


4835 


99 


110 


gi14133215


Homo sapiens 


mRNA for KIAA0781 protein, partial 
cds. 


4678 


100 


111 


gi12642596


Homo sapiens 


nuclear receptor co-repressor/HDAC3 
complex subunit TBLR1 (TBLR1) 
mRNA, complete cds. 


2725 


100 


111 


AAB95225 


Homo sapiens 


Human protein sequence SEQ ID 
NO:17352. 


2720 


99 


111 


gi10434648


Homo sapiens 


cDNA FLJ12894 fis, clone
NT2RP2004170, moderately similar to 
Homo sapiens mRNA for transducin 
(beta) like 1 protein. 


2720 


99 


112 


gi2224557 


Homo sapiens 


Human mRNA for KIAA0308 gene, 
partial cds. 


6666 


99 


112 


AAY23330 


Homo sapiens 


Human tumour suppressor (kismet) 
protein. 


5759 


98 


112 


gi7243213 


Homo sapiens 


mRNA for KIAA1416 protein, partial 
cds. 


5264 


59


113 


gi12856019


Mus musculus 


putative 


1527 


95 


113 


gi3947604 


Caenorhabditis 
elegans 


cDNA EST yk129f1.3 comes from this
gene~cDNA EST yk129f1.5 comes
from this gene~cDNA EST yk203e4.3
comes from this gene~cDNA EST
yk191a9.3 comes from this
gene~cDNA EST yk262c10.3 comes
from this gene~cDNA EST yk278f9.3
comes from this gene~cDNA EST
yk325c7.3 comes from this
gene~cDNA EST yk337f1.3 comes
from this gene~cDNA EST yk449a2.3
comes from this gene~cDNA EST
yk203e4.5 comes from this
gene~cDNA EST yk191a9.5 comes
from this gene~cDNA EST yk278f9.5
comes from this gene~cDNA EST
yk262c10.5 comes from this
gene~cDNA EST yk325c7.5 comes
from this gene~cDNA EST yk337f1.5
comes from this gene~cDNA EST
yk448g10.5 comes from this
gene~cDNA EST yk449a2.5 comes
from this gene~cDNA EST yk636e2.3
comes from this gene~cDNA EST
yk636e2.5 comes from this
gene~cDNA EST yk550e8.3 comes
from this gene~cDNA EST yk557a9.3
comes from this gene~cDNA EST
yk579c12.3 comes from this
gene~cDNA EST yk614e7.3 comes
from this gene~cDNA EST yk653f1.3
comes from this gene~cDNA EST
yk672b2.3 comes from this
gene~cDNA EST yk550e8.5 comes
from this gene~cDNA EST yk556b1.5
comes from this gene~cDNA EST
yk557a9.5 comes from this
gene~cDNA EST yk579c12.5 comes
from this gene~cDNA EST yk606c8.5
comes from this gene~cDNA EST
yk614e7.5 comes from this gene


787


41


113 


gi3947603 


Caenorhabditis 
elegans 


cDNA EST yk167h7.3 comes from this
gene~cDNA EST yk167h7.5 comes
from this gene~cDNA EST yk289g5.3
comes from this gene~cDNA EST
yk332h9.3 comes from this
gene~cDNA EST yk289g5.5 comes
from this gene~cDNA EST yk332h9.5
comes from this gene~cDNA EST
yk391h4.5 comes from this
gene~cDNA EST yk653f1.5 comes
from this gene


787 


41 


114 


gi9280136 


Macaca 
fascicularis 


unnamed protein product 


3431 


95 


114 


gi4262617 


Caenorhabditis 
elegans 


contains similarity to dual specificity 
phosphatase, catalyitic domain 
(Pfam:PF00782, Score=16.8, E=7.4e- 
05,N=1) 


470 


35 


114 


gi5706724 


Homo sapiens 


Cdc14B3 phosphatase mRNA,
complete cds. 


166 


30 


115 


AAB95254 


Homo sapiens 


Human protein sequence SEQ ID 
NO:17423. 


3114 


99 


115 


gi14042385


Homo sapiens 


cDNA FLJ14693 fis, clone
NT2RP2005360, weakly similar to 
Homo sapiens sentrin/SUMO-specific 
protease (SENP1) mRNA. 


3114 


99 


115 


gi10314023


Homo sapiens 


sentrin-specific protease (SENP2) 
mRNA, complete cds. 


3107 


99 


116 


gi4240227 


Homo sapiens 


mRNA for KIAA0869 protein, partial 
cds. 


4417 


98 
116


gi13879506


Mus musculus 


Unknown (protein for 
IMAGE:3963643) 


4063 


89 


116 


AAB93267 


Homo sapiens 


Human protein sequence SEQ ID 
NO: 12300. 


1895 


97 


117 


gi13235092


Homo sapiens 


mRNA for testis specific protein A14 
(TSGA14 gene). 


1957 


100 


117 


gi10438839


Homo sapiens 


cDNA: FLJ22445 fis, clone 
HRC09438. 


1950 


99 


117 


gi13235344


Mus musculus 


testis specific protein a 14 


1704 


87 


118 


gi7959279 


Homo sapiens 


mRNA for KIAA1509 protein, partial 
cds. 


6769 


99 


118 


AAB94101 


Homo sapiens 


Human protein sequence SEQ ID 
NO:14322. 


1871 


99 


118 


gi10434073


Homo sapiens 


cDNA FLJ12531 fis, clone
NT2RM4000199. 


1871 


99 


119 


AAM00936 


Homo sapiens 


Human bone marrow protein, SEQ ID 
NO: 412. 


3350 


100 


119 


AAB42828 


Homo sapiens 


Human ORFX ORF2592 polypeptide 
sequence SEQ ID NO:5184. 


2064 


100 


119 


gi9557949 


Homo sapiens 


mRNA for hypothetical protein
(ORF1), clone
Telethon(Italy B41) Strait02270 FL142.


1931 


100 


120 


AAB11082 


Homo sapiens 


Human secreted protein ZALPHA13 
protein.


2783 


93 


120 


gi11230043


Homo sapiens 


unnamed protein product 


2783 


93 


120 


AAB37988 


Homo sapiens 


Human secreted protein encoded by 
gene 5 clone HDPAS92. 


2747 


93 


121 


gi12852526


Mus musculus 


putative 


1689 


80 


121 


AAB41765 


Homo sapiens 


Human ORFX ORF1529 polypeptide 
sequence SEQ ID NO:3058. 


1576 


100 


121 


gi4406663 


Homo sapiens 


clone 24945 mRNA sequence, partial 
cds. 


1576 


100 


122 


AAR22958 


Homo sapiens 


Human proteasome component HC5. 


1010 


85 


122 


gi220026 


Homo sapiens 


Human mRNA for proteasome subunit 
HC5. 


1010 


85 


122 


gi3790135 


Homo sapiens 


Human DNA sequence from clone 
RP1-191N21 on chromosome 6q27. 
Contains a 7 transmembrane receptor 
(rhodopsin family) (olfactory receptor 
like) pseudogene, the PDCD2 gene for 
programmed cell death 2 (RP8 
homolog), the TBP gene for TATA box 
binding protein, the gene for 
proteasome subunit HC5, ESTs, STSs 
and GSSs, complete sequence. 


1010 


85 


123 


AAB21027 


Homo sapiens 


Human nucleic acid-binding protein, 
NuABP-31. 


1456 


100 


123 


AAB45146 


Homo sapiens 


Human secreted protein sequence 
encoded by gene 27 SEQ ID NO: 87. 


1456 


100 


123 


gi4884258 


Homo sapiens 


mRNA; cDNA DKFZp564O092 (from 
clone DKFZp564O092); partial cds. 


1430 


100 


124 


gi13325436


Homo sapiens 


Similar to RIKEN cDNA
C330013D18 gene, clone MGC:11226
IMAGE:3937599, mRNA, complete
cds.


1394


100


124 


gi13559363


Homo sapiens 


MRPL9 mRNA for mitochondrial 
ribosomal protein L9 (L9mt), complete 
cds. 


1388 


99 


124 


AAG93251 


Homo sapiens 


Human protein HP02612. 


1153 


86 


125 


AAB85507 


Homo sapiens 


Human protein kinase SGK164. 


2949 


100 


125 


gi13543922


Homo sapiens


Similar to RIKEN cDNA 5430416A05
gene, clone MGC:12903
IMAGE:3537086, mRNA, complete
cds.


2913 


100 


125 


gi12856491


Mus musculus


putative


2135 


79 


126 


gi12653817


Homo sapiens 


Similar to Male-specific RNA 84Dd, 
clone MGC:3092 IMAGE:3349383,
mRNA, complete cds. 


3399 


100 


126 


AAB94115 


Homo sapiens 


Human protein sequence SEQ ID 
NO:14356. 


3392 


99 


126 


gi10434102


Homo sapiens 


cDNA FLJ12549 fis, clone
NT2RM4000689 


3392 


99 


127 


gi7243187 


Homo sapiens 


mRNA for KIAA1403 protein, partial 
cds. 


6448 


98 


127 


gi12652971


Homo sapiens 


clone MGC:858 IMAGE:3357380,
mRNA, complete cds.


3992 


100 


127 


AAB92872 


Homo sapiens 


Human protein sequence SEQ ID 
NO: 11460. 


3987 


99 


128 


AAB94324 


Homo sapiens 


Human protein sequence SEQ ID 
NO: 14807. 


1779 


99 


128 


gi10434528


Homo sapiens 


cDNA FLJ12816 fis, clone
NT2RP2002609, weakly similar to 2-
HYDROXYMUCONIC 
SEMIALDEHYDE HYDROLASE (EC 
3.1.1.-). 


1779 


99 


128 


AAB42143 


Homo sapiens 


Human ORFX ORF1907 polypeptide 
sequence SEQ ID NO:3814. 


1521 


100 


129 


gi6329945 


Homo sapiens 


mRNA for KIAA1140 protein, partial
cds. 


1857 


52 


129 


gi12805043


Homo sapiens 


clone IMAGE:3461487, mRNA, 
partial cds. 


1279 


54 


129 


gi7302173 


Drosophila 
melanogaster 


BcDNA:LD21719 gene product


1261 


35 


130 


AAB28199 


Homo sapiens 


Human HMG-17 non histone 
chromosomal protein. 


322 


75 


130 


gi306864 


Homo sapiens 


Human non-histone chromosomal 
protein HMG-17 mRNA, complete cds. 


322 


75 


130 


gi32329 


Homo sapiens 


Human HMG-17 gene for non-histone 
chromosomal protein HMG-17. 


322 


75 


131 


gi16041794


Homo sapiens 


clone MGC:23591 IMAGE:4856946, 
mRNA, complete cds. 


2714 


99 


131 


gi15559462


Homo sapiens 


Similar to old astrocyte specifically 
induced substance, clone MGC:20215 
IMAGE:4546950, mRNA, complete 
cds. 


2709 


99 
131


gi4519621 


Mus musculus 


OASIS protein 


2406 


91 


132 


gi7573591 


Homo sapiens 


Human DNA sequence from clone 
RP1-309K20 on chromosome 20 
Contains the gene for a novel protein 
similar to dysferlin, the SPAG4 gene 
for sperm associated antigen 4, the 
CPNE1 gene for Copine I (similar to 
KIAA0636), the gene KIAA0765 
(HRIHFB2091) for an RNA 
recognition motif (RNP, RRM or RBD 
domain) containing protein and the 3' 
end of the NIFS gene for cysteine 
desulfurase. Contains ESTs, STSs, 
GSSs and four putative CpG islands, 
complete sequence. 


4972 


100 


132 


gi15559252


Homo sapiens 


RNA binding motif protein 12, clone 
MGC:19528 IMAGE:3845090, mRNA,
complete cds. 


4972 


100 


132 


gi15215375


Homo sapiens 


RNA binding motif protein 12, clone 
MGC:16487 IMAGE:3956772, mRNA,
complete cds. 


4972 


100 


133 


gi12697774


Mus musculus 


acetyl-CoA synthetase 2 


3181 


87 


133 


gi12697772


Bos taurus 


acetyl-CoA synthetase 2 


3056 


83 


133 


AAB34712 


Homo sapiens 


Human secreted protein encoded by 
DNA clone vo9 1. 


2721 


100 


134 


gi7020783 


Homo sapiens 


cDNA FLJ20580 fis, clone REC00516.


848 


100 


134 


gi15012026


Homo sapiens 


Similar to hypothetical protein 
FLJ20580, clone MGC:13430 
IMAGE:4093763, mRNA, complete 
cds. 


848


100 


134 


gi12833008


Mus musculus 


putative 


814 


85 


135 


AAB94473 


Homo sapiens 


Human protein sequence SEQ ID 
NO:15139. 


1970 


100 


135 


AAG74880 


Homo sapiens 


Human colon cancer antigen protein 
SEQ ID NO:5644.


1970 


100 


135 


AAB43720 


Homo sapiens 


Human cancer associated protein 
sequence SEQ ID NO:1165.


1970 


100 


136 


gi10047285


Homo sapiens 


mRNA for KIAA1605 protein, partial 
cds. 


3610 


99 


136 


gi16215453


Homo sapiens 


mRNA for bile acid beta-glucosidase. 


3610 


99 


136 


gi15030210


Homo sapiens 


KIAA1605 protein, clone MGC:16895 
IMAGE:4339156, mRNA, complete 
cds. 


3610 


99 


137 


gi4914601 


Homo sapiens 


mRNA; cDNA DKFZp564A026 (from
clone DKFZp564A026). 


4171 


94 


137 


AAB94357 


Homo sapiens 


Human protein sequence SEQ ID 
NO:14881. 


2195 


99 


137 


AAY45161 


Homo sapiens 


Human secreted protein clone 
C0139 3 protein sequence. 


2112 


100 


138 


gi313131 


Torpedo 
marmorata 


alpha-tubulin 


1192 


97 


138 


gi14198110


Mus musculus 


tubulin alpha 1 


1192 


97 


138 


gi13435777


Mus musculus 


tubulin alpha 6 


1192 


97 
139


AAB94856 


Homo sapiens 


Human protein sequence SEQ ID 
NO:16044. 


2138 


100 


139 


AAB94628 


Homo sapiens 


Human protein sequence SEQ ID 
NO:15490.


2138


100 


139 


gi10436294


Homo sapiens 


cDNA FLJ13970 fis, clone
Y79AA1001533, moderately similar to 
Mouse mRNA for RNA polymerase I 
associated factor (PAF53). 


2138 


100 


140 


gi2642187 


Rattus 
norvegicus 


endo-alpha-D-mannosidase 


1415 


61 


140 


AAB95204 


Homo sapiens 


Human protein sequence SEQ ID 
NO:17303. 


1094 


66 


140 


gi10434559


Homo sapiens 


cDNA FLJ12838 fis, clone 
NT2RP2003230, moderately similar to 
Rattus norvegicus endo-alpha-D- 
mannosidase (Enman) mRNA. 


1094 


66 


141 


gi3449308 


Homo sapiens 


mRNA for MEGF8, partial cds. 


9785 


100 


141 


gi6681364 


Rattus 
norvegicus 


MEGF8 


4772 


95 


141 


gi10728654


Drosophila 
melanogaster 


CG7466 gene product 


2902 


34 


142 


AAY29517 


Homo sapiens 


Human lung tumour protein SAL-82 
predicted amino acid sequence. 


3048 


100 


142 


gi13958036


Homo sapiens 



FYVE-finger protein EIP1 mRNA, 
complete cds. 


3048 


100 


142 


AAY29861 


Homo sapiens 


Human secreted protein clone cb98 4. 


3041 


99 


143 


gi14718539


Homo sapiens 


HIC-3 mRNA, complete cds. 


3178 


99 


143 


gi5689371 


Homo sapiens 


mRNA for KIAA1020 protein, partial
cds. 


2970 


99 


143 


gi7328028 


Homo sapiens 


mRNA; cDNA DKFZp434F0616 (from 
clone DKFZp434F0616); partial cds. 


1738 


100 


144 


gi12620400


Homo sapiens 


mitochondrial carrier protein CGI-69 
long form mRNA, complete cds. 


1856 


99 


144 


AAB42783 


Homo sapiens 


Human ORFX ORF2547 polypeptide 
sequence SEQ ID NO:5094. 


1804 


96 


144 


gi10438783


Homo sapiens 


cDNA: FLJ22407 fis, clone
HRC08407. 


1798 


97 


145 


gi2792366 


Homo sapiens 


unknown protein IT12 mRNA, partial
cds. 


4390 


99 


145 


gi1843399


Homo sapiens 


mRNA, partial cds, clone:RES4-25. 


3676 


99 


145 


gi14602505


Homo sapiens 


clone IMAGE:3936655, mRNA,
partial cds. 


2366 


99 


146 


gi13359167


Homo sapiens 


mRNA for KIAA1646 protein, partial
cds.


2581 


99 


146 


AAY96059 


Homo sapiens 


Human sphingosine kinase C. 


2456 


99 


146 


gi6572330 


Homo sapiens 


Human DNA sequence from clone 
59H18 on chromosome 22. Contains 
the 3' part of the gene for KIAA0767, a 
novel gene, ESTs, STSs, GSSs and a 
putative CpG island, complete 
sequence. 


1627 


96 


147 


gi14043303


Homo sapiens 


exonuclease NEF-sp, clone
MGC:15944 IMAGE:3537866, mRNA,
complete cds.


4043


100






147 


gi13272524


Homo sapiens 


exonuclease NEF-sp mRNA, complete 
cds. 


4039 


99 


147 


gi12053043


Homo sapiens 


mRNA; cDNA DKFZp434J0315 (from 
clone DKFZp434J0315); complete cds. 


3843 


95 


148 


gi7243037 


Homo sapiens 


mRNA for KIAA1328 protein, partial 
cds. 


2894 


100 


148 


gi13874541


Macaca 
fascicularis 


hypothetical protein 


2492 


93 


148 


gi1335313


Homo sapiens 


Human muscle mRNA for embryonic 
myosin heavy chain (SMHCE). 


129 


24 


149 


AAB42399 


Homo sapiens 


Human ORFX ORF2163 polypeptide 
sequence SEQ ID NO:4326. 


1362 


91 


149 


AAB42366 


Homo sapiens 


Human ORFX ORF2130 polypeptide 
sequence SEQ ID NO:4260. 


626 


100 


149 


gi7298594 


Drosophila 
melanogaster 


CG10189 gene product 


223 


35 


150 


AAB95372 


Homo sapiens 


Human protein sequence SEQ ID 
NO:17692. 


1538 


99 


150 


gi10435150


Homo sapiens 


cDNA FLJ13220 fis, clone 
NT2RP4002047, moderately similar to 
GTP-BINDING PROTEIN LEPA. 


1538 


99 


150 


gi10437720


Homo sapiens 


cDNA: FLJ21595 fis, clone 
COL07069. 


1438 


100 


151 


gi3327080 


Homo sapiens 


mRNA for KIAA0633 protein, partial 
cds. 


6823 


99 


151 


gi857571 


Mus musculus 


cordon-bleu gene product 


1345 


81 


151 


gi6094680 


Homo sapiens 


PAC clone RP5-1168M19 from 7p12-q11.21,
complete sequence.


1342 


100 


152 


gi15451265


Macaca 
fascicularis 


hypothetical protein 


2728 


98 


152 


AAB41597 


Homo sapiens 


Human ORFX ORF1361 polypeptide 
sequence SEQ ID NO:2722. 


2650 


100 


152 


gi5689443 


Homo sapiens 


mRNA for KIAA1053 protein, partial 
cds. 


2650 


100 


153 


gi14036062


Homo sapiens 


unnamed protein product 


1930 


100 


153 


AAG81377 


Homo sapiens 


Human AFP protein sequence SEQ ID 
NO:272. 


1925 


99 


153 


gi12833112


Mus musculus 


putative 


1727 


88 


154 


gi12832455


Mus musculus 


putative 


1220 


89 


154 


gi15080314


Homo sapiens 


Similar to RIKEN cDNA 0610010D20
gene, clone MGC:20590
IMAGE:4310241, mRNA, complete
cds. 


514 


100 


154 


gi6002488 


Penicillium 
chrysogenum 


hypothetical protein 


338 


31 


155 


gi14017889


Homo sapiens 


mRNA for KIAA1836 protein, partial 
cds. 


2511 


100 


155 


AAB94592 


Homo sapiens 


Human protein sequence SEQ ID 
NO:15402.


972 


50 


155 


gi10435321


Homo sapiens 


cDNA FLJ13337 fis, clone
OVARC1001880. 


972 


50 


156 


gi14550510


Homo sapiens 


pseudouridylate synthase 1, clone
MGC:2736 IMAGE:2822709, mRNA,
complete cds.


2123


100






156 


gi12804097


Homo sapiens 


Similar to pseudouridine synthase 1, 
clone MGC:11268 IMAGE:3943243,
mRNA, complete cds. 


2123 


100 


156 


gi4455035 


Homo sapiens 


pseudouridine synthase 1 (PUS1) 
mRNA, partial cds. 


1927 


99 


157 


AAY58052 


Homo sapiens 


Human protein kinase H2LAU20 
protein sequence. 


3198 


98 


157 


gi9652080 


Homo sapiens 


protein kinase DYRK4 (DYRK4) 
mRNA, partial cds. 


2844 


100 


157 


AAW71685 


Homo sapiens 


Amino acid sequence of human 
serine/threonine protein kinase. 


1909 


97 


158 


gi7300952 


Drosophila 
melanogaster 


BcDNA:LD21504 gene product 


971 


62 


158 


gi4972728 


Drosophila 
melanogaster 


unknown 


971 


62 


158 


AAB97646 


Homo sapiens 


Ribosomal S3 protein 17. 


831 


99 


159 


AAU02201 


Homo sapiens 


Phosphatase 1 protein-like protein, 
MEM6. 


1514 


100 


159 


gi15551577


Homo sapiens 


unnamed protein product 


1514 


100 


159 


AAB95633 


Homo sapiens 


Human protein sequence SEQ ID 
NO:18363. 


1510 


99 


160 


gi12804573


Homo sapiens 


Similar to CG11334 gene product,
clone MGC:3207 IMAGE:3501899, 
mRNA, complete cds. 


1859 


100 


160 


gi12851419


Mus musculus 


putative 


1590 


86 


160 


gi7302053 


Drosophila 
melanogaster 


CG11334 gene product


1046 


59 


161 


gi1580781


Homo sapiens 


Human beige-like protein (BGL) 
mRNA, partial cds. 


9734 


99 


161 


gi10180266


Mus musculus 


LBA 


9333 


86 


161 


gi10257401


Mus musculus 


LBA isoform beta


8920 


86 


162 


gi15082589


Homo sapiens 


clone MGC:4408 IMAGE:2906200, 
mRNA, complete cds. 


2065 


99 


162 


gi15638615


Arabidopsis 
thaliana 


HEN1 


350 


37 


162 


gi13241746


Arabidopsis 
thaliana 


CORYMBOSA2 


350 


37 


163 


gi15291227


Drosophila 
melanogaster 


GH13040p 


701 


40 


163 


gi7303780 


Drosophila 
melanogaster 


CG12214 gene product 


701 


40 


163 


AAB95882 


Homo sapiens 


Human protein sequence SEQ ID 
NO:18991. 


501 


100 


164 


gi3327170


Homo sapiens 


mRNA for KIAA0678 protein, partial 
cds. 


5255 


100 


164 


AAB95304


Homo sapiens 


Human protein sequence SEQ ID 
NO:17542.


4431 


99 


164 


gi14134120


Caenorhabditis 
elegans 


endocytosis protein RME-8 


2127 


42 


165 


AAB53427 


Homo sapiens 


Human colon cancer antigen protein 
sequence SEQ ID NO:967. 


813 


96 
165


gi13905098


Mus musculus 


B-cell translocation gene 1, anti- 
proliferative 


813 


96 


165 


gi293306 


Mus musculus 


B-cell translocation gene-1 protein 


813 


96 


166 


gi13365897


Macaca 
fascicularis 


hypothetical protein 


2501 


97 


166 


AAY02168 


Homo sapiens 


A facilitative glucose transporter 
protein GLUT8. 


870 


99 


166 


gi13445575


Homo sapiens 


facilitative glucose transporter 
GLUT10 (SLC2A10) mRNA, complete 
cds. 


835 


39 


167 


gi13365897


Macaca 
fascicularis 


hypothetical protein 


2173 


97 


167 


AAY02168 


Homo sapiens 


A facilitative glucose transporter 
protein GLUT8.


870 


99 


167 


gi13445575


Homo sapiens 


facilitative glucose transporter 
GLUT10 (SLC2A10) mRNA, complete 
cds. 


678 


37 


168 


gi10047251


Homo sapiens 


mRNA for KIAA1588 protein, partial 
cds. 


3292 


100 


168 


gi14424704


Homo sapiens 


clone MGC:15071 IMAGE:4110510,
mRNA, complete cds. 


2315 


100 


168 


gi4567179 


Homo sapiens 


chromosome 19, BAC 37295 (CIT-B- 
21A4), complete sequence. 


1269 


43 


169 


gi15558943


Homo sapiens 


guanylate binding protein 4 mRNA, 
complete cds. 


3134 


99 


169 


gi1174187


Mus musculus 


purine nucleotide binding protein 


2260 


70 


169 


gi193444


Mus musculus 


guanylate binding protein 


1986 


66 


170 


gi14585859


Homo sapiens 


hypothetical protein SB138


1121 


100 


170 


gi6665778 


Mus musculus 


cyclin ania-6b 


1052 


92 


170 


gi12841169


Mus musculus 


putative 


1052 


92 


171 


AAB64407


Homo sapiens 


Amino acid sequence of human 
intracellular signalling molecule 
INTRA39. 


3394 


100 


171 


AAB71963 


Homo sapiens 


Human TGF-beta receptor encoded by 
cDNA clone HFEHY04. 


3394 


100 


171 


gil0438U3 


Homo sapiens 


cDNA: FLJ21908 fis, clone HEP03830. 


3385 


99 


172 


gi12652533


Homo sapiens 


clone MGC:2637 IMAGE:3505128, 
mRNA, complete cds.


676 


89 


172 


AAB67453 


Homo sapiens 


Amino acid sequence of a human 
chaperone polypeptide. 


668 


88 


172 


gi9758421 


Arabidopsis 
thaliana 


gene_id:MHF15.7~similar to unknown
protein- 


199 


28 


173 


AAB97025 


Homo sapiens 


Human colon carcinoma suppressor 
gene-related protein. 


1773 


61 


173 


gi9857318 


Homo sapiens 


Asef mRNA for APC-stimulated
guanine nucleotide exchange factor, 
complete cds. 


1773 


61 


173 


gi8809845 


Homo sapiens 


chromosome 2q22 RhoGEF mRNA, 
complete cds. 


1700 


61 


174 


gi12052828


Homo sapiens 


mRNA; cDNA DKFZp564N1062
(from clone DKFZp564N1062); 
complete cds. 


1601 


99 


174 


gi12850603


Mus musculus 


putative 


1062 


92 
174


AAB94655 


Homo sapiens 


Human protein sequence SEQ ID 
NO:15568.


671 


100 


175 


gi15080282


Homo sapiens 


Similar to putative sialoglycoprotease 
type 2, clone MGC:20293 
IMAGE:4121450, mRNA, complete 
cds. 


1747 


99 


175 


gi11071727


Homo sapiens 


mRNA for putative sialoglycoprotease 
type 2. 


1707 


92 


175 


gi12847276


Mus musculus 


putative 


1541 


84 


176 


AAB36628 


Homo sapiens 


Human FLEXHT-50 protein sequence
SEQ ID NO:50.


527 


100 


176 


AAB94208 


Homo sapiens 


Human protein sequence SEQ ID 
NO:14557.


527 


100 


176 


AAG01512 


Homo sapiens 


Human secreted protein, SEQ ID NO: 
5593. 


527 


100 


177 


gi15929052


Homo sapiens 


Similar to RIKEN cDNA 2810442O16
gene, clone MGC:23197 
IMAGE:4861869, mRNA, complete 
cds. 


2084 


100 


177 


gi11493155


Homo sapiens 


Human DNA sequence from clone 
RP5-852M4 on chromosome 20. 
Contains the gene encoding the HBV 
associated factor, a novel gene similar 
to Drosophila CG17883, a putative
novel gene, two CpG islands, ESTs,
GSSs, and STSs, complete sequence.


1952 


100 


177 


gi12840168


Mus musculus 


putative 


1938 


93 


178 


AAB87034 


Homo sapiens 


Human secreted protein TANGO 339, 
SEQ ID NO:3.


1449 


100 


178 


AAY76266 


Homo sapiens 


Human secreted protein encoded by 
gene 10 fragment 


1449 


100 


178 


AAB87135 


Homo sapiens 


Human secreted protein TANGO 339 
F20Y variant, SEQ ID NO:139. 


1446 


99


179 


gi434763 


Homo sapiens 


Human mRNA for KIAA0120 gene, 
complete cds.


1048 


100 


179 


gi14424677


Homo sapiens 


transgelin 2, clone MGC:15279
IMAGE:4301018, mRNA, complete 
cds. 


1048 


100 


179 


gi9956026 


Homo sapiens 


clone CDABP0035 mRNA sequence. 


1048 


100 


180 


AAB31677 


Homo sapiens 


Amino acid sequence of a human 
protein having a hydrophobic domain. 


2803 


100 


180 


AAE03346 


Homo sapiens 


Human gene 19 encoded secreted 
protein HCRNF14, SEQ ID NO: 120. 


2803 


100 


180 


AAE03310 


Homo sapiens 


Human gene 19 encoded secreted 
protein HCRNF14, SEQ ID NO:84. 


2803 


100 


181 


AAB41910 


Homo sapiens 


Human ORFX ORF1674 polypeptide 
sequence SEQ ID NO:3348. 


1530 


99 


181 


gi5262467 


Homo sapiens 


mRNA; cDNA DKFZp564I122 (from 
clone DKFZp564I122). 


1530 


99 


181 


gi12849716


Mus musculus 


putative 


1259 


82 


182 


gi2072972 


Homo sapiens 


Human L1 element L1.25 p40 and
putative p150 genes, complete cds.


497 


53 


182 


AAB64943 


Homo sapiens 


Human secreted protein sequence
encoded by gene 7 SEQ ID NO:121.


494


54






182 


gi5070622 


Homo sapiens 


retrotransposon L1 insertion in X-linked
retinitis pigmentosa locus,
complete sequence. 


494 


53 


183 


AAB59191 


Homo sapiens 


Human NADE. 


217 


47 


183 


gi8452894 


Homo sapiens 


p75NTR-associated cell death executor 
(NADE) mRNA, complete cds. 


217 


47 


183 


gi189379


Homo sapiens 


Human unknown protein from clone 
pHGR74 mRNA, complete cds. 


217 


47 


184 


AAB88468 


Homo sapiens 


Human membrane or secretory protein 
clone PSEC0263. 


4931 


97 


184 


gi14272788


Homo sapiens 


unnamed protein product 


4931 


97 


184 


gi577301 


Homo sapiens 


Human mRNA for KIAA0090 gene, 
partial cds. 


4650 


99 


185 


AAG64953 


Homo sapiens 


Human ATP-dependent helicase 
protein 68. 


3169 


100 


185 


gi12052748


Homo sapiens 


mRNA; cDNA DKFZp564B1023 
(from clone DKFZp564B1023); 
complete cds. 


2716 


100 


185 


gi12836314


Mus musculus 


putative 


2655 


83 


186 


gi14017781


Homo sapiens 


mRNA for KIAA1782 protein, partial 
cds. 


2834 


99 


186 


gi4062983 


Mus musculus 


Eos protein 


2747 


95 


186 


gi11612390


Homo sapiens 


zinc finger transcription factor Eos 
mRNA, complete cds. 


2603 


98 


187 


AAB95721 


Homo sapiens 


Human protein sequence SEQ ID 
NO:18592.


2419 


100 


187 


gi10436538


Homo sapiens 


cDNA FLJ14153 fis, clone 
NT2RM1000092, weakly similar to 
MULTIDRUG RESISTANCE 
PROTEIN 2. 


2419 


100 


187 


gi12248763


Homo sapiens 


mRNA for SMAP-4, complete cds. 


2323 


96


188 


gi13278906


Homo sapiens 


clone MGC:4440 IMAGE:2959536, 
mRNA, complete cds. 


1040 


100 


188 


gi13278819


Homo sapiens 


clone MGC:2776 IMAGE:2959536,
mRNA, complete cds. 


1040 


100 


188


AAB95829 


Homo sapiens 


Human protein sequence SEQ ID 
NO:18847. 


618 


79 


189 


gi14602977


Homo sapiens 


Similar to KIAA0789 gene product, 
clone MGC:16602 IMAGE:4110708,
mRNA, complete cds. 


3100 


99 


189 


gi3043570 


Homo sapiens 


mRNA for KIAA0523 protein, partial 
cds. 


2564 


100 


189 


gi14133217


Homo sapiens 


mRNA for KIAA0789 protein, partial 
cds. 


1463 


49 


190 


gi9717245


Mus musculus 


cytoplasmic dynein heavy chain 


5569 


98 


190 


gi294543 


Rattus 
norvegicus 


dynein heavy chain 


5557 


98 


190 


gi402528 


Rattus 
norvegicus 


cytoplasmic dynein heavy chain 


5535 


98 


191 


gi13537204


Homo sapiens 


mRNA for MAST205, complete cds.


6834 


98 


191 


gi406058 


Mus musculus 


protein kinase 


6343 


86 


191 


gi3882335 


Homo sapiens 


mRNA for KIAA0807 protein, partial
cds.


6300


98






192 


gi12847109


Mus musculus 


putative 


1356 


79 


192 


gi13623271


Homo sapiens 


Similar to RIKEN cDNA 2600005P05 
gene, clone MGC:11321 
IMAGE:3951804, mRNA, complete
cds. 


1332 


100 


192 


gi12847837


Mus musculus 


putative 


1170 


76 


193 


gi38149 


Pongo 
pygmaeus 


epsilon-globin 


397 


100 


193 


gi903731 


Gorilla gorilla 


epsilon-globin 


397 


100 


193 


gi903707 


Pan
troglodytes


epsilon-globin 


397 


100 


194 


AAB74695 


Homo sapiens 


Human membrane associated protein 
MEMAP-1. 


1799 


100 


194 


AAE01340 


Homo sapiens 


Human gene 22 encoded secreted 
protein fragment, SEQ ID NO:205. 


1799 


100 


194 


gi15929183


Homo sapiens 


modulator of apoptosis 1, clone 
MGC:9487 IMAGE:3922055, mRNA, 
complete cds. 


1799 


100 


195 


AAG93260 


Homo sapiens 


Human protein HP10106.


1769 


100 


195 


gi15029765


Mus musculus 


RIKEN cDNA 2810039M17 gene 


1650 


91 


195 


gi12849932


Mus musculus 


putative 


1650 


91 


196 


gi14017843


Homo sapiens 


mRNA for KIAA1813 protein, partial 
cds. 


3434 


100 


196 


gi15193290


Homo sapiens 


LAPSER1 (LAPSER1) mRNA, 
complete cds. 


3309 


100 


196 


gi8217421 


Homo sapiens 


Human DNA sequence from clone 
RP11-108L7 on chromosome 10.
Contains part of the gene for a novel
Insulin-like growth factor binding type
protein with Kazal-type serine protease
inhibitor domain, the gene for a novel
protein similar to rat tricarboxylate
carrier, the gene for a novel PDZ
(DHR, GLGF) domain protein, the 
gene for a novel protein similar to 
KIAA0552, KIAA0341 and Fugu 
hypothetical protein 2, the gene for a 
novel protein similar to Plasmodium 
POM1 and C. elegans F46G11.1, a 
putative novel gene, the SEMA4G gene 
for semaphorin 4G and a novel gene. 
Contains ESTs, STSs, GSSs and seven 
putative CpG islands, complete 
sequence. 


3264 


100 


197 


gi1458241


Caenorhabditis 
elegans 


Hypothetical protein B0507.2 


782


39 


197 


gi12832510


Mus musculus 


putative 


490 


89 


197 


AAB54014 


Homo sapiens 


Human pancreatic cancer antigen 
protein sequence SEQ ID NO:466. 


242 


100 


198 


gi500747 


Mus musculus 


capping protein beta-subunit, isoform 1


1440 


98 


198 


gi212902 


Gallus gallus 


actin-capping protein Z beta subunit 


1432 


98 


198 


gi12805189


Mus musculus 


capping protein (actin filament) muscle
Z-line, beta


1318


92






199 


gi14017787


Homo sapiens 


mRNA for KIAA1785 protein, partial 
cds. 


3195 


100 


199 


gi13436428


Homo sapiens 


Similar to feminization 1 a homolog 
(C. elegans), clone MGC:4216 
IMAGE:2957950, mRNA, complete 
cds. 


2197 


64 


199 


gi12836689


Mus musculus 


putative 


2164 


65 


200 


gi7959811 


Homo sapiens 


PRO1167


389 


100 


200 


gi2736345 


Caenorhabditis 
elegans 


contains similarity to G-coupled protein 
receptors 


69 


33 


200 


gi7504953 


Caenorhabditis 
elegans 


hypothetical protein H22D07.1 - 
Caenorhabditis elegans > 


69 


33 


201 


gi12697975


Homo sapiens 


mRNA for KIAA1715 protein, partial 
cds. 


2230 


100 


201 


AAB42461 


Homo sapiens 


Human ORFX ORF2225 polypeptide 
sequence SEQ ID NO:4450. 


1015 


100 


201 


gi12844031


Mus musculus 


putative 


567 


92 


202 


gi7296176 


Drosophila 
melanogaster 


CG2839 gene product 


195 


27 


202 


gi10438900


Homo sapiens 


cDNA: FLJ22490 fis, clone
HRC10983. 


184 


97 


202 


gi5824430 


Caenorhabditis 
elegans 


cDNA EST yk501h2.5 comes from this
gene~cDNA EST yk523d4.5 comes
from this gene~cDNA EST yk553fb\5
comes from this gene~cDNA EST
yk595g12.5 comes from this
gene~cDNA EST yk606g10.5 comes
from this gene~cDNA EST yk652B.5
comes from this gene


182 


21 


203 


AAM00957 


Homo sapiens 


Human bone marrow protein, SEQ ID 
NO: 433. 


1725 


100 


203 


gi4151807 


Rattus 
norvegicus 


membrane-associated guanylate kinase- 
interacting protein 2 Maguin-2 


1484 


62 


203 


gi4151805 


Rattus 
norvegicus 


membrane-associated guanylate kinase- 
interacting protein 1 Maguin-1 


1484 


62 


204 


AAM00844 


Homo sapiens 


Human bone marrow protein, SEQ ID 
NO: 207. 


1051 


98 


204 


gi4151807 


Rattus 
norvegicus 


membrane-associated guanylate kinase- 
interacting protein 2 Maguin-2 


779 


69 


204 


gi4151805 


Rattus 
norvegicus 


membrane-associated guanylate kinase- 
interacting protein 1 Maguin-1 


779 


69 


205 


AAM00957 


Homo sapiens 


Human bone marrow protein, SEQ ID 
NO: 433. 


1576 


92 


205 


gi4151807 


Rattus 
norvegicus 


membrane-associated guanylate kinase- 
interacting protein 2 Maguin-2 


1349 


57 


205 


gi4151805 


Rattus 
norvegicus 


membrane-associated guanylate kinase- 
interacting protein 1 Maguin-1 


1349 


57


206 


gi7242969 


Homo sapiens 


mRNA for KIAA1307 protein, partial 
cds. 


8582 


99 


206 


AAM00860 


Homo sapiens 


Human bone marrow protein, SEQ ID 
NO: 223. 


4841 


98 


206 


gi4426611 


Drosophila 


pushover 


2137 


46 
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melanogaster 








207 


AAB62210 


Homo sapiens 


Human ABCA2 transporter protein. 


9835 


99 


207 


gi13173186


Homo sapiens 


ABC transporter ABCA2 (ABCA2) 
mRNA, complete cds. 


9835 


99 


207 


gi9957467 


Homo sapiens 


ATP-binding cassette sub-family A 
member 2 (ABCA2) mRNA, complete 
cds. 


9835 


99 


208 


AAB94358 


Homo sapiens 


Human protein sequence SEQ ID 
NO:14883.


2268 


99 


208 


gi10434632


Homo sapiens 


cDNA FLJ12886 fis, clone
NT2RP2004041, weakly similar to 
SYNAPSINS IA AND IB. 


2268 


99 


208 


gi12052738


Homo sapiens 


mRNA; cDNA DKFZp564H1322 
(from clone DKFZp564H1322);
complete cds. 


2268 


99 


209 


gi14627122


Homo sapiens 


Human DNA sequence from clone 
RP4-583P15 on chromosome 20 
Contains ESTs, STSs, GSSs and ten 
CpG islands. Contains the TNFRSF6B 
gene for tumor necrosis factor receptor 
6b (decoy), the 3' part of the 
KIAA1088 gene, the ARFRP1 gene for 
ADP-ribosylation factor related protein 
1, two genes for novel proteins, the 
gene for a GLUT4 enhancer factor and 
the gene for a novel zinc finger protein 
similar to rat RIN ZF and the gene for a 
novel BTB/POZ domain containing 
zinc finger protein, complete sequence. 


2074 


99 


209 


gi13162677


Homo sapiens 


GLUT4 enhancer factor mRNA, 
complete cds. 


2055 


98 


209 


gi12655101


Homo sapiens 


clone IMAGE:3140406, mRNA, 
partial cds. 


1766 


100 


210 


gi14279329


Homo sapiens 


ubiquitin specific protease (USP28)
mRNA, complete cds. 


4131 


92 


210 


gi7959297 


Homo sapiens 


mRNA for KIAA1515 protein, partial
cds. 


3872 


100 


210 


AAB31552 


Homo sapiens 


A human ubiquitin specific protease 25 
(USP25). 


2058 


48 


211 


AAB36579 


Homo sapiens 


Human FLEXHT-1 protein sequence 
SEQ ID NO:1.


1829 


100


211 


AAB94048 


Homo sapiens 


Human protein sequence SEQ ID 
NO:1421i. 


1825 


99 


211 


gi10433984


Homo sapiens 


cDNA FLJ12475 fis, clone
NT2RM1000962. 


1825 


99 


212 


gi15824499


Homo sapiens 


GalNAc-4-O-sulfotransferase 1
mRNA, complete cds. 


2238 


100 


212 


gi11990885


Homo sapiens 


GalNAc4ST mRNA for GalNAc 4-
sulfotransferase, complete cds. 


2238 


100 


212 


gi15559803


Homo sapiens 


carbohydrate (N-acetylgalactosamine
4-O) sulfotransferase 8, clone
MGC:20987 IMAGE:4635405, mRNA, 
complete cds. 


2238 


100 
213


AAB43387 


Homo sapiens 


Human ORFX ORF3151 polypeptide 
sequence SEQ ID NO:6302. 


1056 


100 


213 


gi15292317


Drosophila 
melanogaster 


LD46863p 


549 


50 


213 


gi7302029 


Drosophila 
melanogaster 


CG12054 gene product 


549 


50 


214 


gi12843216


Mus musculus 


putative 


913 


84 


214 


gi14585867


Homo sapiens 


hypothetical protein SB145


297 


44 


214 


gi14388386


Macaca 
fascicularis 


hypothetical protein 


295 


44 


215 


gi14133219


Homo sapiens 


mRNA for KIAA0833 protein, partial 
cds. 


7195 


99 


215 


gi6580410 


Homo sapiens 


Human DNA sequence from clone 
RP3-467L1 on chromosome 1p36.21-36.33.
Contains the 3' part of gene
KIAA0833, the VAMP3 gene for 
vesicle-associated membrane protein 3 
(cellubrevin), the PER3 gene for period 
(Drosophila) homolog 3 and the gene 
for urotensin II. Contains two putative
CpG islands, ESTs, STSs and GSSs, 
complete sequence. 


3642 


99 


215 


AAB42729 


Homo sapiens 


Human ORFX ORF2493 polypeptide 
sequence SEQ ID NO:4986.


997 


54 


216 


gi7293088 


Drosophila 
melanogaster 


CG9213 gene product 


811 


30 


216 


gi15810333


Arabidopsis 
thaliana 


unknown protein 


713 


28 


216 


gi13324888


Caenorhabditis 
elegans 


Hypothetical protein B0361.2


710 


34 


217 


gi2443331 


Xenopus 
laevis 


Nfrl 


2421


75 


217 


AAB34944 


Homo sapiens 


Human secreted protein sequence 
encoded by gene 20 SEQ ID NO:148.


1129 


91 


217 


gi15292543


Drosophila 
melanogaster 


SD06560p 


911 


36 


218 


gi7243111 


Homo sapiens 


mRNA for KIAA1365 protein, partial 
cds. 


3855 


100 


218 


gi1657758


Rattus 
norvegicus 


densin-180 


3640 


93 


218 


gi8570180 


Rattus 
norvegicus 


densin-180 variant D


1250 


83 


219 


gi14017839


Homo sapiens 


mRNA for KIAA1811 protein, partial
cds. 


1726 


80


219 


gi3217028 


Homo sapiens 


mRNA for putative serine/threonine
protein kinase, partial. 


1450 


84 


219 


gi7294217 


Drosophila 
melanogaster 


CG6114 gene product


1055 


70 


220 


gi7297674 


Drosophila 
melanogaster 


CG13139 gene product 


942 


75 


220 


gi12857050


Mus musculus 


putative 


767 


62 


220 


gi15636900


Gallus gallus 


avEna neural variant 


139 


52 


221 


gi15489242


Homo sapiens 


clone IMAGE:3859726, mRNA,
partial cds.


1001


88






221 


gi13543991


Homo sapiens 


clone IMAGE:3627860, mRNA, 
partial cds.


1001 


88 


221 


gi12847182


Mus musculus 


putative 


328 


39 


222 


gi14133209


Homo sapiens 


mRNA for KIAA0654 protein, partial 
cds. 


6089 


99 


222 


gi930343 


Homo sapiens 


Human LAR-interacting protein 1b
mRNA, complete cds. 


3559 


60 


222 


gi930341 


Homo sapiens 


Human LAR-interacting protein 1a
mRNA, complete cds. 


3503 


60 


223 


gi12620207


Homo sapiens 


C1orf25 mRNA, complete cds.


3807 


98 


223 


gi9588430 


Homo sapiens 


Human DNA sequence from clone 
GS1-120K12 on chromosome 1q25.3-31.2.
Contains the gene for ring finger
protein DING or BAP-1, an FTH1
(ferritin, heavy polypeptide 1)
pseudogene, the 3' end of the gene for a
novel protein similar to archaeal, yeast
and worm N2,N2-dimethylguanosine
tRNA methyltransferase, ESTs, STSs, 
GSSs and two putative CpG islands, 
complete sequence. 


2300 


98 


223 


gi12835704


Mus musculus 


putative 


1420 


88 


224 


gi14595658


Xenopus 
laevis 


LIM protein prickle


2865 


67 


224 


gi10727796


Drosophila 
melanogaster 


esn gene product 


698 


42 


224 


gi6634092 


Drosophila 
melanogaster 


LIM-domain protein 


698 


42 


225 


gi13375149


Homo sapiens 


Human DNA sequence from clone 
RP5-1118M15 on chromosome 20
Contains part of a gene similar to P14 
Bos taurus (P14L), a novel gene, ESTs, 
STSs, GSSs and a CpG Island, 
complete sequence. 


957 


99 


225 


gi7259265 


Mus musculus 


contains transmembrane (TM) region 


314 


50 


225 


AAY53871 


Homo sapiens 


A human brain-derived signalling 
factor polypeptide.


299 


45 


226 


gi12803987


Homo sapiens 


clone MGC:4174 IMAGE:3634226, 
mRNA, complete cds. 


743 


100 


226 


gi12805417


Mus musculus 


Unknown (protein for MGC:7354) 


444 


66 


226 


gi12849498


Mus musculus 


putative 


235 


72 


227 


AAY91629 


Homo sapiens 


Human secreted protein sequence 
encoded by gene 23 SEQ ID NO:302. 


1391 


87 


227 


gi7677403 


Homo sapiens 


F-box protein FBG2 (FBG2) mRNA, 
complete cds. 


1391 


87 


227 


AAY83046 


Homo sapiens 


F-box protein FBP-6. 


1333 


82 


228 


gi15079958


Homo sapiens 


chromosome 11 open reading frame
24, clone MGC:19741 
IMAGE:3614861, mRNA, complete 
cds. 


2231 


99


228 


gi11527205


Homo sapiens 


DM4E3 (C11orf24) mRNA, complete
cds. 


2224 


99 
228


AAB18965


Homo sapiens 


Amino acid sequence of a human 
transmembrane protein. 


2055 


99 


229 


gi15930199


Homo sapiens 


Similar to RIKEN cDNA 4921523I18
gene, clone MGC:9467 
IMAGE:3914747, mRNA, complete
cds. 


1451 


99 


229 


gi13278594


Mus musculus 


RIKEN cDNA 4921523I18 gene


1440 


97 


229 


gi12856904


Mus musculus 


putative 


1440 


97 


230 


gi15680131


Homo sapiens 


hypothetical protein FLJ12171, clone
MGC:19889 IMAGE:4652087, mRNA,
complete cds. 


1638 


100 


230 


gi14043242


Homo sapiens 


hypothetical protein FLJ12171, clone
MGC:15694 IMAGE:3351601, mRNA, 
complete cds. 


1638 


100 


230 


AAB93912 


Homo sapiens 


Human protein sequence SEQ ID
NO:13880.


1634 


99 


231 


AAB56947 


Homo sapiens 


Human prostate cancer antigen protein 
sequence SEQ ID NO:1525.


779 


100 


231 


AAB68408 


Homo sapiens 


Amino acid sequence of a human 
NOV1 polypeptide. 


574 


100 


231 


AAY81695 


Homo sapiens 


Human PTN protein sequence. 


574 


100 


232 


gi11138034


Homo sapiens 


mRNA for KIAA1173 protein,
complete cds. 


2665 


100 


232 


AAG89259 


Homo sapiens 


Human secreted protein, SEQ ID NO: 
379. 


2654 


99 


232 


gi12834372


Mus musculus 


putative


2427 


90 


233 


AAB98612 


Homo sapiens 


Human tumour suppressor gene, 
TSG16, protein. 


1706 


55 


233 


gi11596412


Homo sapiens 


GAC-1 (GAC-1) mRNA, complete cds. 


893 


77 


233 


gi4240237 


Homo sapiens 


mRNA for KIAA0874 protein, partial 
cds. 


893 


77 


234 


AAB41108 


Homo sapiens 


Human ORFX ORF872 polypeptide 
sequence SEQ ID NO:1744. 


4170 


99 


234 


gi6331287 


Homo sapiens 


mRNA for KIAA1274 protein, partial
cds. 


3936 


99 


234 


gi1545959


Mus musculus 


paladin 


3560 


80 


235 


gi9368849 


Homo sapiens 


mRNA; cDNA DKFZp761G2113
(from clone DKFZp761G2113). 


972 


99 


235 


gi7293878 


Drosophila 
melanogaster 


CG13379 gene product 


274 


36 


235 


gi14532482


Arabidopsis 
thaliana 


AT5g58570/mzn1_20


152 


31 


236 


gi3242242 


Mus musculus 


hyperpolarization-activated cation 
channel, HAC2 


4309 


91 


236 


gi7407645 


Rattus 
norvegicus 


hyperpolarization-activated, cyclic 
nucleotide-gated potassium channel 1 


4306 


91 


236 


gi2708316 


Mus musculus 


brain cyclic nucleotide gated 1; Bcng-1;
brain specific ion channel protein


4301 


91 


237 


AAB13370 


Homo sapiens 


Human brain-associated protein 
HBAP-1. 


1055 


100 


237 


gi9944291 


Homo sapiens 


TTYH1 mRNA, complete cds. 


1055 


100 


237 


gi9651109 


Macaca
fascicularis


TTYH1 


1032 


98


238


AAU00476 


Homo sapiens 


Human INTERCEPT 400 protein. 


1428 


100 


238 


AAY79266 


Homo sapiens 


Human elongase homologue HS3. 


1428 


100 


238 


AAB29648 


Homo sapiens 


Human membrane-associated protein 
HUMAP-5. 


1428 


100 


239 


AAB84885 


Homo sapiens 


Human protein, SEQ ID 14. 


4029 


99 


239 


AAB84882 


Homo sapiens 


Human protein, SEQ ID 6. 


4029 


99 


239 


gi5262593 


Homo sapiens 


mRNA; cDNA DKFZp434N093 (from 
clone DKFZp434N093); partial cds. 


3684 


99 


240 


gi13477247


Homo sapiens 


Similar to RIKEN cDNA
5031400M07 gene, clone MGC:13079
IMAGE:3840918, mRNA, complete
cds.


2153 


100 


240 


AAB18987 


Homo sapiens 


Amino acid sequence of a human 
transmembrane protein. 


2148 


99 


240 


gi7670425 


Mus musculus 


unnamed protein product 


1904 


89 


241 


AAG63222 


Homo sapiens 


Amino acid sequence of a human lipid 
metabolism enzyme. 


2194 


100 


241 


gi14861069


Mus musculus 


phosphatidyl inositol phosphate kinase 
type II gamma 


2120 


95 


241 


gi3387798 


Rattus 
norvegicus 


phosphatidylinositol 5-phosphate 4- 
kinase gamma 


2087 


95 


242 


gi7295732 


Drosophila 
melanogaster 


ft gene product 


2915 


39 


242


gi157409


Drosophila 
melanogaster 


fat protein 


2901 


39 


242 


gi10727403


Drosophila 
melanogaster 


ds gene product 


2236 


34 


243 


AAF90315 aa 
2 


Homo sapiens 


Winged helix/zinc finger transcription
factor FOXP1 cDNA.


819 


98 


243 


AAB82339 


Homo sapiens 


Winged helix/zinc finger transcription 
factor FOXP1. 


819 


98 


243 


gi12043714


Homo sapiens 


clone pAB195 FOXP1 (FOXP1) 
mRNA, complete cds. 


819 


98 


244 


gi10440073


Homo sapiens 


cDNA: FLJ23399 fis, clone HEP18254.


2620 


100 


244 


gi7018524 


Homo sapiens 


mRNA; cDNA DKFZp762K137 (from 
clone DKFZp762K137); partial cds. 


2524 


100 


244 


gi14133227


Homo sapiens 


mRNA for KIAA0970 protein, partial 
cds. 


1367 


51 


245 


AAB94855 


Homo sapiens 


Human protein sequence SEQ ID 
NO:16042. 


1347 


100 


245 


gi10436290


Homo sapiens 


cDNA FLJ13968 fis, clone
Y79AA1001493, weakly similar to
UBIQUITIN-CONJUGATING
ENZYME E2-17 KD 9 (EC 6.3.2.19).


1347 


100 


245 


gi16198439


Homo sapiens 


hypothetical protein FLJ13855, clone
MGC:16842 IMAGE:3915698, mRNA,
complete cds. 


1347 


100 


246 


gi6330302 


Homo sapiens 


mRNA for KIAA1185 protein, partial
cds. 


2041 


100 


246 


AAG74603 


Homo sapiens 


Human colon cancer antigen protein 
SEQ ID NO:5367.


1530 


97 


246 


AAB53321 


Homo sapiens 


Human colon cancer antigen protein 
sequence SEQ ID NO:861. 


1530 


97 



163 



WO 02/081731 



PCT/US02/01222 



Table 2 



SEQ ID NO: 


Accession No. 


Species 


Description 


Score 


%
Identity


247 


gi535390 


Homo sapiens


Human cellular retinol binding protein 
II (CRBPII) mRNA, complete cds.


715 


99 


247 


gi397352 


Mus musculus 


mCRBPII


674 


91 


247 


gi12833902


Mus musculus 


putative 


669 


90 


248 


AAG01285 


Homo sapiens 


Human secreted protein, SEQ ID NO: 
5366. 


209 


87 


248 


AAR05562 


Homo sapiens 


Laminin-binding protein encoded by
insert from J9 lambda gt10 phage.


209 


87 


248 


gi1149509


Gallus gallus 


37kD Laminin receptor precursor /p40 
ribosomal associated protein 


209 


87 


249 


gi13162226


Homo sapiens 


Human DNA sequence from clone
RP4-543J19 on chromosome 20
Contains part of the GNAS1 gene
encoding guanine nucleotide binding
protein (G protein, alpha stimulating
activity polypeptide 1) including
neuroendocrine secretory protein 55
(NESP55), the CTSZ gene encoding
cathepsin Z, the ATP5E gene encoding
ATP synthase (H+ transporting,
mitochondrial F1 complex, epsilon
subunit), the gene encoding protein
HSPC130 (TH1 Drosophila homolog),
the gene for tubulin beta 1 class VI
(TUBB1), a gene encoding the CGI-
107 protein (LOC51012), four CpG
islands, ESTs, STSs and GSSs,
complete sequence.


1591 


100 


249 


gi11230445


Homo sapiens 


TUBB1 gene for human beta tubulin 1, 
class VI. 


1591 


100 


249 


gi212834 


Gallus gallus 


beta-tubulin 


1340 


85 


250 


gi13162226


Homo sapiens 


Human DNA sequence from clone
RP4-543J19 on chromosome 20
Contains part of the GNAS1 gene
encoding guanine nucleotide binding
protein (G protein, alpha stimulating
activity polypeptide 1) including
neuroendocrine secretory protein 55
(NESP55), the CTSZ gene encoding
cathepsin Z, the ATP5E gene encoding
ATP synthase (H+ transporting,
mitochondrial F1 complex, epsilon
subunit), the gene encoding protein
HSPC130 (TH1 Drosophila homolog),
the gene for tubulin beta 1 class VI
(TUBB1), a gene encoding the CGI-
107 protein (LOC51012), four CpG
islands, ESTs, STSs and GSSs,
complete sequence.


1986 


100 


250 


gi11230445


Homo sapiens 


TUBB1 gene for human beta tubulin 1, 
class VI. 


1986 


100 


250 


gi212834 


Gallus gallus 


beta-tubulin 


1699 


85 


251 


gi559325 


Homo sapiens 


Human mRNA for ATP synthase alpha
subunit, complete cds.


1566


99


251 


gi559317 


Homo sapiens 


Human gene for ATP synthase alpha
subunit, complete cds (exon 1 to 12). 


1566 


99 


251 


gi34468 


Homo sapiens 


H.sapiens mRNA for mitochondrial 
ATP synthase. 


1566 


99 


252 


gi559325 


Homo sapiens 


Human mRNA for ATP synthase alpha 
subunit, complete cds. 


2192 


84 


252 


gi559317 


Homo sapiens 


Human gene for ATP synthase alpha 
subunit, complete cds (exon 1 to 12). 


2192 


84 


252 


gi34468 


Homo sapiens 


H.sapiens mRNA for mitochondrial 
ATP synthase. 


2192 


84 


253 


gi14550508


Homo sapiens 


Similar to CG8974 gene product, clone 
MGC:2460 IMAGE:2964524, mRNA, 
complete cds. 


1051 


100 


253 


gi15928691


Mus musculus 


Unknown (protein for MGC:19394)


1036 


98 


253 


gi7293133


Drosophila 
melanogaster 


CG8974 gene product 


608 


66 


254 


AAE04880 


Homo sapiens 


Human protease protein-7 (PRTS-7). 


2795 


100 


254 


gi14043577


Homo sapiens 


hypothetical protein FLJ12455, clone 
MGC:13149 IMAGE:4298740, mRNA,
complete cds. 


2795 


100 


254 


AAB94023 


Homo sapiens 


Human protein sequence SEQ ID 
NO:14157. 


2781 


99 


255 


gi2501855 


Homo sapiens 


22 kDa actin-binding protein (SM22) 
gene, complete cds. 


937 


95 


255 


gi2340833 


Homo sapiens 


DNA for SM22 alpha, complete cds. 


937 


95 


255 


gi2335047 


Homo sapiens 


mRNA for SM22 alpha, complete cds. 


937 


95 


256 


gi15080204


Homo sapiens


similar to prokaryotic-type class I
peptide chain release factors, clone
MGC:20261 IMAGE:3029407, mRNA,
complete cds.




00 


256 


gi6706658


Homo sapiens


Human DNA sequence from clone
RP1-101K10 on chromosome 6q25-26.
Contains a novel gene, the gene for a
novel protein similar to Prokaryotic-
type class I peptide chain release
factors, the 3' end of gene RGS17
(RGSZ2) for regulator of G-protein
signaling 17, ESTs, STSs, GSSs and
two putative CpG islands, complete
sequence.


iy*t\i 


00 

yy 


256 


gi15680165


Homo sapiens 


similar to prokaryotic-type class I 
peptide chain release factors, clone 
MGC:20252 IMAGE:4646472, mRNA, 
complete cds. 


1375 


98 


257 


gi15080204


Homo sapiens 


similar to prokaryotic-type class I 
peptide chain release factors, clone 
MGC:20261 IMAGE:3029407, mRNA,
complete cds. 


1706 


90 


257 


gi6706658 


Homo sapiens 


Human DNA sequence from clone
RP1-101K10 on chromosome 6q25-26.
Contains a novel gene, the gene for a
novel protein similar to Prokaryotic-
type class I peptide chain release
factors, the 3' end of gene RGS17
(RGSZ2) for regulator of G-protein
signaling 17, ESTs, STSs, GSSs and
two putative CpG islands, complete
sequence.


1698


89


257 


gi15680165


Homo sapiens 


similar to prokaryotic-type class I 
peptide chain release factors, clone 
MGC:20252 IMAGE:4646472, mRNA, 
complete cds. 


1133 


85 


258 


gi7295482 


Drosophila 
melanogaster 


CG4603 gene product 


616 


41 


258 


gi12322327


Arabidopsis 
thaliana 


unknown protein 


451 


46 


258 


gi9454545 


Arabidopsis 
thaliana 


Unknown protein 


451 


46 


259 


AAB95307 


Homo sapiens 


Human protein sequence SEQ ID
NO: 17548. 


5011 


100 


259 


gi14042477


Homo sapiens 


cDNA FLJ14740 fis, clone 
NT2RP3002602, weakly similar to 
PROBABLE PROTEIN DISULFIDE 
ISOMERASE ER-60 PRECURSOR 
(EC 5.3.4.1). 


5011 


100 


259 


gi15862252


Homo sapiens 


unnamed protein product 


5008 


99 


260 


gi15079416


Homo sapiens 


secreted modular calcium-binding 
protein 1, clone MGC:19895
IMAGE:4549051, mRNA, complete 
cds. 


2359 


100 


260 


AAB19394


Homo sapiens 


Amino acid sequence of a human 
secreted protein. 


2355 


99 


260 


gi10432431


Homo sapiens 


mRNA for secreted modular calcium-
binding protein (smoc1 gene).


2343 


99 


261 


gi7020475 


Homo sapiens 


cDNA FLJ20400 fis, clone KAT00587.


1687 


100 


261 


gi1118097


Caenorhabditis 
elegans 


proline and glycine-rich 


268 


33 


261 


AAW49723 


Homo sapiens 


Protein polymer adhesive substrate 
PPAS1-F. 


261 


32 


262 


gi16197949


Drosophila 
melanogaster 


LD21896p 


325 


29 


262 


gi7293303 


Drosophila 
melanogaster 


CG9089 gene product 


325 


29 


262 


gi3170539 


Takifugu 
rubripes 


unknown 


291 


40 


263 


AAB42525 


Homo sapiens 


Human ORFX ORF2289 polypeptide 
sequence SEQ ID NO:4578. 


3570 


80 


263 


gi2887497 


Homo sapiens 


chromosome 19, overlapping cosmids 
R28707 and R34001, complete 
sequence. 


3570 


80 


263 


AAB42538 


Homo sapiens 


Human ORFX ORF2302 polypeptide 
sequence SEQ ID NO:4604.


2835 


99 


264 


gi14017849


Homo sapiens 


mRNA for KIAA1816 protein, partial 
cds. 


1637 


99 


264 


gi8655687 


Homo sapiens 


mRNA; cDNA DKFZp762E1511
(from clone DKFZp762E1511).


892


100


264 


gi6979930 


Homo sapiens 


Maml mRNA, partial cds. 


315 


30 


265 


gi12836420


Mus musculus 


putative 


2511 


93 


265 


gi10437002


Homo sapiens 


cDNA: FLJ21013 fis, clone
CAE05223. 


1859 


99 


265 


AAB58385 


Homo sapiens 


Lung cancer associated polypeptide 
sequence SEQ ID 723. 


1704 


99 


266 


gi14198321


Mus musculus 


ribosomal protein L31


543 


92 


266 


&51U5 


Rattus 
norvegicus 


ribosomal protein L31 (AA 1-125) 


543 


92 


266 


gi14586963


Mus musculus 


M75 


543 


92 


267 


gi178424


Homo sapiens 


Human apolipoprotein A-II mRNA, 
complete cds. 


478 


96 


267 


gi296634 


Homo sapiens 


Human gene for apolipoprotein A-II.


478 


96 


267 


gi296633


Homo sapiens


XlUlllritl XJLVIX lOl apoiipoprOLCm A-JU. 


H/O 




268 


AAB47184 


Homo sapiens


fWsi JLu\. piUlClli dCljUCilCC. 


JJ / 1 


1 AA 


268 


gi7321168 


Homo sapiens 


Human DNA sequence from clone 
RP5-860F19 on chromosome 20p12.3-
13 Contains the gene for KIAA1442
(similar to olfactory neuronal 
transcription factors (COE1, COE2, 
COE3, EBF3, OLF1)), RPL19 (60S 
ribosomal protein L19) and HSPC080
pseudogenes, the gene for 
metallocarboxypeptidase (CPX-1) and 
a novel gene. Contains ESTs, STSs, 
GSSs and four CpG islands, complete 
sequence. 


3571 


100 


268 


AAB36174 


Homo sapiens 


Human APG04 protein. 


3567 


99 


269 


gi2314829


Homo sapiens 


jerky gene product homolog mRNA, 
complete cds. 


1430 


59 


269 


gi10140857


Mus musculus 


jerky 1 


752 


33 


269 


AAG62624 


Homo sapiens 


Human cell nucleus regulatory protein 
56. 


598 


34 


270 


gi7959227 


Homo sapiens 


mRNA for KIAA1483 protein, partial 
cds. 


2231 


99 


270 


gi34192 


Homo sapiens 


Human KUP mRNA for protein with 
two zinc fingers. 


627 


39 


270 


gi13310782


Mus musculus 


myoneurin 


315 


24 


271 


AAB93814 


Homo sapiens 


Human protein sequence SEQ ID 
NO: 13604. 


1408 


97 


271 


gi10433080


Homo sapiens 


cDNA FLJ11753 fis, clone
HEMBA1005583. 


1408 


97 


271 


AAB41771 


Homo sapiens 


Human ORFX ORF1535 polypeptide
sequence SEQ ID NO:3070. 


821 


99 


272 


gi7959197 


Homo sapiens 


mRNA for KIAA1468 protein, partial 
cds. 


4603 


100 


272 


gi15080502


Homo sapiens 


clone MGC:16944 IMAGE:4339646,
mRNA, complete cds. 


4317 


94 


272 


gi9755831 


Arabidopsis
thaliana


putative protein 


675 


27 


273 


gi15080502


Homo sapiens 


clone MGC:16944 IMAGE:4339646,
mRNA, complete cds. 


4362 


98


273


gi7959197


Homo sapiens 


mRNA for KIAA1468 protein, partial 
cds. 


4360 


96 


273 


gi9755831 


Arabidopsis 
thaliana


putative protein 


704 


28 


274 


AAB92483 


Homo sapiens 


Human protein sequence SEQ ID 
NO: 10570. 


2626 


100 


274 


gi7021875 


Homo sapiens 


cDNA FLJ10051 fis, clone
HEMBA1001281.


2626 


100 


274 


gi12837616


Mus musculus 


putative 


2065 


90 


275 


gi10716076


Homo sapiens


mRNA for testis-abundant finger
protein, complete cds. 


2739 


100 


275 


gi14043332


Homo sapiens


Similar to ring finger protein 23, clone
MGC:2475 IMAGE:3051389, mRNA,
complete cds. 




04 


275 


gi10716078


Mus musculus 


testis-abundant finger protein


2497 


92 


276 


AAB44673 


Homo sapiens 


Human secreted protein sequence 
encoded by gene 33 SEQ ID NO:138. 


1014 


96 


276 


gi1747


Oryctolagus 
cuniculus 


trichohyalin 


213 


22 


276 


gi13936996


Human 
herpesvirus 8 


ORF73 


203 


22 


277 


AAG74326 


Homo sapiens 


Human colon cancer antigen protein 
SEQ ID NO:5090.


1101 


100 


277 


AAB56461 


Homo sapiens 


Human prostate cancer antigen protein 
sequence SEQ ID NO: 1039. 


778 


100 


277 


gi12842930


Mus musculus 


putative 


688 


90 


278


gliVZU14D 


Homo sapiens 


Human DNA binding protein (HPF2)
mRNA, complete cds. 


1528 


47 


278




Homo sapiens 


Human DNA sequence from clone 
RP1-54B20 on chromosome Xpl 1.1- 
1 i.j. contains me 2 enu 01 a novel 
SSX family protein gene, two novel 

EfP Aft tinv rrvntninino f^?W? Hmp Tins* 

finger protein genes, a KRAB box 

nrntpiti nQPiinncypTK 3 * tViP opnp Cnr a 

£/lUlGlU ^lOvUUUgbllC, LUC gvUC i*JL A 

novel protein similar to lysozyme C 

ZNF81 gene for zinc finger protein 81 
(HFZ20), ESTs, STSs, GSSs and three 
CpG islands, complete sequence. 


1497 




278 


gi498152 


Homo sapiens 


Human mRNA for KIAA0065 gene, 
partial cds. 


1495 


46 


279 


gi2914676 


Homo sapiens 


chromosome 16, cosmid clone 36QH6 
(LANL), complete sequence. 


882 


35 


279 


gi14250678


Homo sapiens 


clone MGC:10489 IMAGE:3945548,
mRNA, complete cds. 


882 


35 


279 


gi2342506 


Homo sapiens 


mRNA for zinc finger protein FPM315,
complete cds. 


875 


35 


280 


gi434779 


Homo sapiens 


Human mRNA for KIAA0112 gene,
partial cds. 


2072 


100 


280 


gi15278392


Homo sapiens 


homolog of yeast ribosome biogenesis
regulatory protein RRS1, clone
MGC:4831 IMAGE:3603972, mRNA,
complete cds.


1905


100


280 


gi12804751


Homo sapiens 


Similar to regulator for ribosome
resistance homolog (S. cerevisiae),
clone MGC:2755 IMAGE:2824034,
mRNA, complete cds.


1905 


100 


281 


AAB95761 


Homo sapiens


Human protein sequence SEQ ID
NO: 18686. 


/ 07 


mo 


281 


AAG81272 


Homo sapiens 


Human AFP protein sequence SEQ ID 
NO:62. 


789 


100 


281 


gi14035852


Homo sapiens 


unnamed protein product 


789 


100 


282 


gi15080911


Homo sapiens


neo-poly(A) polymerase mRNA,
complete cds.


11Q1 

j I? 1 


QQ 

yy 


282 


gi15384858


Homo sapiens 


mRNA for poly(A) polymerase gamma
(PAPOLG gene).


3797 


99 


282 


gi13641252


Homo sapiens 


SRP RNA 3' adenylating enzyme/pap2 
mRNA, complete cds. 


3779 


99 


283 




Homo sapiens


mRNA; cDNA DKFZp434A1014

(from clone DKFZp434A1014); partial 
cds. 


1 All 

143 / 


Of 


283 


gi12853788


Mus musculus 


putative 


408 


38 


283 


gi4468790


Xenopus
laevis


speedy protein.


ID** 


zo 


284 


gi3327062 


Homo sapiens


mRNA for KIAA0694 protein, partial
cds.


10170 

iy 


00 

yy 


284 


gi13702612


Staphylococcus
aureus subsp.
aureus N315


ORFID:SA2447-hypothetical protein, 
similar to streptococcal hemagglutinin 
protein 


223 


19 


284 


gi14248429


Staphylococcus
aureus subsp.
aureus Mu50


hypothetical protein 


223 


19 


285 


gi12697941


Homo sapiens 


mRNA for KIAA1698 protein, partial
cds.


4716 


100 


285 


gi7299794 


Drosophila 
melanogaster 


CG9591 gene product 


290 


31 


285 




Homo sapiens


Natural killer lytic associated protein. 


92 


An 

40 


286 


AAG62395 


Homo sapiens 


Human zinc finger protein 46. 


2375 


100 


286 




Homo sapiens


Human DNA sequence from clone 
RP11-393J16 on chromosome 10.
Contains part of the ZNF33A gene for
zinc finger protein 33a (KOX 31), a
novel gene for a novel KRAB box 
containing zinc finger gene, a zinc 
finger pseudogene, ESTs, STSs, GSSs 
and two putative CpG islands, complete 
sequence. 


2015 


100 


286 


gi881564 


Homo sapiens 


Human zinc finger containing protein 
ZNF157 (ZNF157) mRNA, complete 
cds. 


1339 


51 


287 


gi2822143 


Homo sapiens 


chromosome 19, cosmid R30217, 
complete sequence. 


1838 


53 


287 


gi9968290 


Homo sapiens 


mRNA for zinc finger protein (ZNF304 
gene). 


1735 


50 


287 


gi13543419


Homo sapiens 


Similar to zinc finger protein 304,
clone MGC:4079 IMAGE:3530863,
mRNA, complete cds.


1735


51


288 


gi540469 


Homo sapiens 


(clone HGT26) T cell receptor gamma- 
chain mRNA, V region. 


399 


91 


288 


gi3047024 


Homo sapiens 


T-cell receptor gamma V1 gene region.


384 


100 


288 


gi339167 


Homo sapiens 


Human T-cell receptor rearranged 
gamma-chain gene V-region (V4) 
(subgroup I). 


384 


100 


289 


AAY69976 


Homo sapiens 


DHFR-HM protein. 


886 


93 


289 


gi182724


Homo sapiens 


Human dihydrofolate reductase gene. 


886 


93 


289 


gi182717


Homo sapiens 


Human dihydrofolate reductase gene, 
exon 6 and 3' flank. 


886 


93 


290 


AAE01782 


Homo sapiens 


Human gene 13 encoded secreted 
protein HDPNW93, SEQ ID NO:103. 


4269 


99 


290 


gi10437433


Homo sapiens 


cDNA: FLJ21347 fis, clone 
COL02724. 


4127 


97 


290 


AAB74693 


Homo sapiens 


Human protease and protease inhibitor 
PPM-26. 


3948 


99 


291 


gi6681662 


Mus musculus 


ENH3 


955 


90 


291 


gi12844277


Mus musculus 


putative 


800 


79 


291 


AAY12510 


Homo sapiens 


Human 5' EST secreted protein SEQ ID
NO:541. 


648 


99 


292 


AAB47327 


Homo sapiens 


FCTR4. 


2798 


98 


292 


gi15141735


Homo sapiens 


unnamed protein product 


2798 


98 


292 


gi9663126 


Homo sapiens 


mRNA for chromosome 12 open
reading frame 3 (C12orf3). 


214 


24 


293 


gi10440367


Homo sapiens 


mRNA for FLJ00018 protein, partial
cds. 


5938 


100 


293 


gi15488570


Homo sapiens 


Similar to hypothetical protein 
FLJ00018, clone MGC:10073 
IMAGE:3896004, mRNA, complete 
cds. 


4736 


99 


293 


gi10438857


Homo sapiens 


cDNA: FLJ22458 fis, clone 
HRC10001. 


1570 


99 


294 


AAB08948 


Homo sapiens 


Human secreted protein sequence 
encoded by gene 21 SEQ ID NO:105.


1601 


99 


294 


AAB08911 


Homo sapiens 


Human secreted protein sequence 
encoded by gene 21 SEQ ID NO:68. 


1601 


99 


294 


AAB80238 


Homo sapiens 


Human PRO238 protein.


641 


44 


295 


AAB18457 


Homo sapiens 


A human TANGO 216 polypeptide 
clone. 


2106 


98 


295 


AAB18447 


Homo sapiens 


Amino acid sequence of human 
TANGO 216 polypeptide. 


2106 


98 


295 


gi14017381


Homo sapiens 


tumor endothelial marker 8 precursor 
(TEM8) mRNA, complete cds. 


1231 


57 


296 


gi14388342


Macaca 
fascicularis 


hypothetical protein 


3833 


92 


296 


gi7243195 


Homo sapiens 


mRNA for KIAA1407 protein, partial 
cds. 


3817 


100 


296 


gi15451319


Macaca 
fascicularis 


hypothetical protein 


2408 


91 


297 


gi7243039 


Homo sapiens 


mRNA for KIAA1329 protein, partial 
cds. 


4761 


100


297


gi12007720


Mus musculus 


VPS10 domain receptor protein 
SorCS2 


4466 


88 


297 


gi7715916 


Mus musculus 


SorCSb splice variant of the VPS10 
domain receptor SorCS 


2177 


47 


298 


AAM00812 


Homo sapiens 


Human bone marrow protein, SEQ ID 
NO: 175. 


1488 


99 


298 


gi12846045


Mus musculus 


putative 


1387 


65 


298 


AAM00925 


Homo sapiens 


Human bone marrow protein, SEQ ID 
NO: 401. 


996 


100 


299 


gi7298852 


Drosophila 
melanogaster 


CG10068 gene product


609 


43 


299 


gi8655669 


Homo sapiens 


mRNA; cDNA DKFZp547C176 (from 
clone DKFZp547C176). 


482 


52 


299 


AAB42048 


Homo sapiens 


Human ORFX ORF1812 polypeptide 
sequence SEQ ID NO:3624.


325 


46 


300 


gi14043285


Homo sapiens 


Similar to KIAA0808 gene product,
clone MGC:1 jooO IMAGE:3j2S/159,
mRNA, complete cds.


1306 


97 


300


gl/ZOoyiZ 


Homo sapiens 


Human DNA sequence from clone
RP5-1108D11 on chromosome 20q12-
13.11 Contains part of the gene for a
novel protein similar to C. elegans

X79(***1 7 -naff r»f tVi*» ap*ne* fr\r n nnvpl 
1 l . / , pail UI LUC gCHc lOi a uOVcl 

HMG (high mobility group) box 
protein similar to KIAA0737, 
KIAA0808 and TNRC9 (CAGF9),
ESTs, STSs, GSSs and two putative 
CpG islands, complete sequence. 


HAT 

797 


96 


300 


gi3882337 


Homo sapiens 


mRNA for KIAA0808 protein, 
complete cds. 


767 


55 


301 


gi15430292


Homo sapiens 


muscle alpha-kinase (MAK) mRNA, 
complete cds. 


5445 


99 


301 


gi7243041 


Homo sapiens 


mRNA for KIAA1330 protein, partial
cds. 


4933 


100


301 


gi14331137


Mus musculus 


myocytic induction/differentiation
originator 


3684 


72 


302 


gi14550508


Homo sapiens 


Similar to CG8974 gene product, clone 
MGC:2460 IMAGE:2964524, mRNA, 
complete cds. 


589 


100 


302 


gi15928691


Mus musculus 


Unknown (protein for MGC:19394)


574 


97 


302 


gi2564951 


Mus musculus 


unknown 


378 


72 


303 


gi7242955 


Homo sapiens 


mRNA for KIAA1300 protein, partial 
cds. 


9573 


99 


303 


gi6599162 


Homo sapiens 


mRNA; cDNA DKFZp434N1272 
(from clone DKFZp434N1272); partial 
cds. 


1392 


98 


303 


AAG75083 


Homo sapiens 


Human colon cancer antigen protein 
SEQIDNO:5847. 


628 


92 


304 


gi1408209


Homo sapiens 


Human endogenous retrovirus HERV- 
K(HML6) proviral clone HML6.17 
putative polymerase and envelope 
genes, partial cds, and 3' LTR.


398 


86 


304 


gi2801455 


Mouse
mammary
tumor virus


Pr160


176


48


304 


gi6911288 


Exogenous 
mouse 
mammary 
tumor virus 


Gag-Pro-Pol 


176 


48 


305 


gi14269502


Homo sapiens 


unconventional myosin 1G valine form 
(MYO1G) mRNA, MYO1G-V allele,
partial cds. 


3269 


98 


305 


gi14269504


Homo sapiens 


unconventional myosin 1G methionine
form (MYO1G) mRNA, MYO1G-M
allele, partial cds. 


3266 


97 


305 


gi3724141 


Rattus 
norvegicus 


myosin I 


3130 


57 


306 


gi2145060 


Homo sapiens 


TTF-I interacting peptide 20 mRNA, 
partial cds. 


2081 


99 


306 


gi2224593 


Homo sapiens 


Human mRNA for KIAA0326 gene, 
partial cds. 


648 


39 


306 


gi488555 


Homo sapiens 


Human zinc finger protein ZNF135
mRNA, complete cds. 


590 


40 


307 


gi13183883


Homo sapiens 


PD-1-ligand 2 protein (PDL2) mRNA,
complete cds. 


1417 


99 


307 


gi13569410


Homo sapiens 


butyrophilin precursor B7-DC mRNA, 
complete cds. 


1417 


99 


307 


AAE01352 


Homo sapiens 


Human gene 1 encoded secreted 
protein HDPPA04, SEQ ID NO:74. 


1416 


99 


308 


AAB87436 


Homo sapiens 


Human gene 22 encoded secreted 
protein fragment, SEQ ID NO: 177. 


383 


100 


308 


AAB94868 


Homo sapiens 


Human protein sequence SEQ ID 
NO:16072.


383 


100 


308 


gi10436314


Homo sapiens 


cDNA FLJ13984 fis, clone
Y79AA1001846. 


383 


100 


309 


AAY85025 


Homo sapiens 


Human Rap2 amino acid sequence. 


206 


33 


309 


gi4678734 


Homo sapiens 


Human gene from PACs 37M17 and 
305B16, chromosome X, similar to 
small G proteins, especially RAP-2A. 


206 


33 


309 


AAM00956 


Homo sapiens 


Human bone marrow protein, SEQ ID 
NO: 432. 


205 


32 


310 


gi36905 


Homo sapiens 


Human mRNA for T-cell receptor 
alpha-chain HAP50 V(a)8.2-J(a)M 


590 


100 


310 


gi1223888


synthetic 
construct 


T cell receptor alpha chain 


586 


100 


310 


gi2358036 


Homo sapiens 


T-cell receptor alpha delta locus from 
bases 250472 to 501670 (section 2 of 
5) of the Complete Nucleotide
Sequence. 


586 


100 


311 


AAE01596 


Homo sapiens 


Human gene 13 encoded secreted 
protein HCLCJ15, SEQ ID NO:146.


1066 


92 


311 


AAE04136 


Homo sapiens 


Human gene 6 encoded secreted 
protein HCLBW50, SEQ ID NO: 123. 


1066 


92 


311 


gi31135 


Homo sapiens 


H.sapiens mRNA for elongation factor
1-beta. 


1066 


92 


312 


gi7243137 


Homo sapiens 


mRNA for KIAA1378 protein, partial
cds.


2400


99


312 


gi12314036


Homo sapiens 


Human DNA sequence from clone 
RP3-383J4 on chromosome 1q24.1-
24.3 Contains part of a gene encoding a 
kelch motif containing protein, part of a 
novel gene encoding a protein similar 
to Aspartyl-TRNA synthetase, a 
putative novel gene, a 40S ribosomal 
protein S27 (RPS27) pseudogene, 2 
CpG islands, ESTs, STSs and GSSs, 
complete sequence. 


1184 


44 


312 


gi4650844 


Homo sapiens 


mRNA for Kelch motif containing 
protein, complete cds. 


1176 


44 


313 


gi7019945 


Homo sapiens 


cDNA FLJ20079 fis, clone COL03057.


1610 


83 


313 


gi12804721


Homo sapiens 


clone MGC:2663 IMAGE:3543910, 
mRNA, complete cds. 


1271 


48 


313 


AAB43912 


Homo sapiens 


Human cancer associated protein 
sequence SEQ ID NO: 1357. 


1255 


45 


314 


AAB41414 


Homo sapiens 


Human ORFX ORF1178 polypeptide
sequence SEQ ID NO:2356. 


5094 


97 


314 


gi6329897 


Homo sapiens 


mRNA for KIAA1137 protein, partial
cds. 


4798 


98 


314 


gi14043759


Homo sapiens 


clone IMAGE:4111596, mRNA,
partial cds. 


3906 


98 


315 


AAB28375 


Homo sapiens 


Human hyperpolarisation-activated 
channel HAC3. 


3686 


99 


315 


gi7959337 


Homo sapiens 


mRNA for KIAA1535 protein, partial 
cds. 


3665 


99 


315 


gi3242244 


Mus musculus 


hyperpolarization-activated cation 
channel, HAC3 


3556 


96 


316 


gi14198399


Mus musculus 


RIKEN cDNA 1500034J20 gene 


837 


93 


316 


gi12854536


Mus musculus 


putative 


837 


93 


316 


gi14250857


Homo sapiens 


Human DNA sequence from clone 
RP5-1137O17 on chromosome 11p12-
14.2 Contains part of a gene similar to 
putative mitochondrial inner
membrane protease subunit 2, a novel
mRNA, ESTs and GSSs, complete 
sequence. 


775


100 


317 


gi10439850


Homo sapiens 


cDNA: FLJ23233 fis, clone
CAS00458. 


1081 


50 


317 


gi9968290 


Homo sapiens 


mRNA for zinc finger protein (ZNF304 
gene). 


1039 


48 


317 


gi14249844


Homo sapiens 


Similar to hypothetical protein 
FLJ23233, clone MGC:14876
IMAGE:3544044, mRNA, complete 
cds. 


1037 


47 


318 


gi11863686


Mus musculus 


neurobeachin 


3371 


96 


318 


gi11863539


Gallus gallus 


neurobeachin 


2100 


89 


318 


AAB92596 


Homo sapiens 


Human protein sequence SEQ ID 
NO: 10843. 


1721 


100 


319 


gi12698174


Macaca 
fascicularis


hypothetical protein 


1221 


95


319


gi10439153


Homo sapiens 


cDNA: FLJ22672 fis, clone HSI09265. 


1085 


99 


319 


gi7020125 


Homo sapiens 


cDNA FLJ20190 fis, clone COLF0714. 


893 


50 


320 


gi2865219 


Homo sapiens 


integrin binding protein Del-1 (Del1)
mRNA, complete cds. 


447 


100 


320 


AAW94685 


Homo sapiens 


Human Del-1 protein. 


438 


98 


320 


AAW10365 


Homo sapiens 


Human developmentally-regulated 
endothelial cell locus-1 protein. 


438 


98 


321 


AAB27246 


Homo sapiens 


Human EXMAD-24 SEQ ID NO: 24. 


2047 


100 


321 


AAB42385 


Homo sapiens 


Human ORFX ORF2149 polypeptide 
sequence SEQ ID NO:4298.


2047 


100 


321 


gi52998 


Mus musculus 


macrophage mannose receptor 
precursor 


164 


31 


322 


gi12834087


Mus musculus 


putative 


1456 


82 


322 


gi2463628 


Homo sapiens 


Human putative monocarboxylate 
transporter (MCT) mRNA, complete 
cds. 


506 


29


322 


gi2198807 


Gallus gallus


monocarboxylate transporter 3 


473 


27 


323 


gi15620909


Homo sapiens 


mRNA for KIAA1925 protein, partial
cds.


1059 


38 


323 


AAB92496 


Homo sapiens 


Human protein sequence SEQ ID 
NO:10598.


1050 


36 


323 


gi7021900 


Homo sapiens 


cDNA FLJ10065 fis, clone
HEMBA1001455. 


1050 




324 


gi9651075 


Macaca 
fascicularis 


unnamed protein product


3716 




324 


gi15145795


Sus scrofa 


basic proline-rich protein


29? 




324 


gi5917666 


Zea mays 


extensin-like protein 


195 


25 


325 


gi7529597




Human DNA sequence from clone
RP3-402N21 on chromosome 6p21.1-
21.31 Contains up to three novel genes
with MAM and immunoglobulin 
domains. Contains ESTs, STSs, GSSs 
and four putative CpG islands, 
complete sequence. 


147 A 


inn 


325 


gi12836077


Mus musculus 


putative 


1365 


95 


325 


AAE00586 


Homo sapiens 


Human nuclear cell adhesion molecule 
homologue, NCAM d 2 protein. 


1303 


49 


326 


gi15278193


Homo sapiens 


MAGI-1C beta mRNA, complete cds, 
alternatively spliced. 


1492 


100 


326 


gi2702351 


Mus musculus 


putative membrane-associated 
guanylate kinase 1 


1112 


83 


326 


gi5817255 


Homo sapiens 


mRNA; cDNA DKFZp434B203 (from 
clone DKFZp434B203); partial cds. 


739 


100 


327 


AAB01432 


Homo sapiens 


Human TANGO 239 (form 2). 


3675 


99 


327 


AAB01426 


Homo sapiens 


Human TANGO 239. 


2700 


100 


327 


AAB00036 


Homo sapiens 


Human TANGO 239 partial sequence. 


2483 


97 


328 


gi7243117


Homo sapiens 


mRNA for KIAA1368 protein, partial 
cds. 


5542 


100 


328 


AAY71460 


Homo sapiens 


Human semaphorin 6A-1.


5422 


98 


328 


gi10187891


Homo sapiens 


unnamed protein product 


5422 


98 


329 


gi13676461


Macaca 
fascicularis 


hypothetical protein 


2193 


75


329


gi4589566 


Homo sapiens 


mRNA for KIAA0961 protein, 
complete cds. 


2190 


75 


329 


gi456269 


Mus musculus 
domesticus 


zinc finger protein 30 


2073 


71 


330 


AAB94295 


Homo sapiens 


Human protein sequence SEQ ID 
NO:14747.


3062 


99 


330 


gi10434454


Homo sapiens 


cDNA FLJ12768 fis, clone 
NT2RP2001576, weakly similar to 
HYPOTHETICAL 62.2 KD PROTEIN 
C4G8.12C IN CHROMOSOME I. 


3062 


99


330 


gi7291781 


Drosophila 
melanogaster


CG3419 gene product 


471 


32 


331 


gi12852801


Mus musculus 


putative 


1185 


95 


331 


gi12314230


Homo sapiens


Human DNA sequence from clone
RP5-846F13 on chromosome [illegible]-
22.1 Contains part of the PPAP2C
(phosphatidic acid phosphatase type 2c)
gene, ESTs, STSs and GSSs, complete
sequence.


y t j 


100


331 


gi7020303 


Homo sapiens 


cDNA FLJ20300 fis, clone HEP06465


748 


56 


332 


gi12309630


Homo sapiens 


Human DNA sequence from clone 
RP11-438B23 on chromosome 9
Contains a novel gene for a neuronal 
leucine-rich repeat protein, ESTs, STSs 
and GSSs, complete sequence. 


3138 


100 


332 


AAB31161 


Homo sapiens 


Amino acid sequence of a human 
TOLL protein. 


2600 


86 


332 


gi13444976


Homo sapiens 


unnamed protein product 


2600 


86 


333 


gi4240145


Homo sapiens 


mRNA for KIAA0828 protein, partial 
cds. 


3226 


99 


333 


gi14249936


Homo sapiens 


Similar to S-adenosylhomocysteine 
hydrolase-like 1, clone 
IMAGE:3536052, mRNA, partial cds. 


3202 


100 


333 


AAW56097 


Homo sapiens 


Amino acid sequence of the 0DD4b53 
enzyme. 


2466 


84 


334 


gi13625385


Homo sapiens 


EPI64 (EPI64) mRNA, complete cds. 


1026 


46 


334 


AAB95321 


Homo sapiens 


Human protein sequence SEQ ID 
NO:17577. 


1023 


50 


334 


gi10435007


Homo sapiens 


cDNA FLJ13130 fis, clone
NT2RP3002972, weakly similar to 
Halocynthia roretzi mRNA for HrPET-1.


1023 


50 


335 


gi15862408


Homo sapiens 


unnamed protein product 


2255 


95 


335 


gi13272520


Mus musculus 


pancreatitis-induced protein 49 


2021 


85 


335 


AAE02778 


Homo sapiens 


Human PRO-C-MG.64 protein encoded 
by DNA-C-MG.64-1776 cDNA clone. 


1784 


95 


336 


gi15862408


Homo sapiens 


unnamed protein product 


2281 


99 


336 


gi13272520


Mus musculus 


pancreatitis-induced protein 49 


2047 


88 


336 


AAE02778 


Homo sapiens 


Human PRO-C-MG.64 protein encoded 
by DNA-C-MG.64-1776 cDNA clone. 


1810 


99 


337 


gi4545313 


Mus musculus 


prominin-like protein 


1021 


77 


337


gi15042603


Rattus 
norvegicus 


prominin


647 


30


337


AAB94028 


Homo sapiens 


Human protein sequence SEQ ID 
NO:14170.


642 


29 


338 


gi2978255 


Mus musculus 


myeloid zinc finger protein-2 


212 


42 


338 


AAB54292 


Homo sapiens 


Human pancreatic cancer antigen 
protein sequence SEQ ID NO:744. 


208 


30 


338 


gi8886436 


Homo sapiens 


myeloid zinc finger protein 1 splice 
variants (ZNF42) gene, complete cds, 
alternatively spliced. 


207 


42 


339 


gi3882269 


Homo sapiens 


mRNA for KIAA0774 protein, partial 
cds. 


5974 


99 


339 


gi12860422


Mus musculus 


putative 


692 


96 


339 


gi15424451


Homo sapiens 


hATIP3 


606 


36 


340 


AAB36617 




Human [illegible] protein sequence


Do*f 


100


340 


gi8218050 


Homo sapiens 


Human DNA sequence from clone
RP1-187J11 on chromosome 6q11.1-
22.33. Contains the gene for a novel
protein similar to S. pombe and S.
cerevisiae predicted proteins, the gene
for a novel protein similar to protein
kinase C inhibitors, the 3' end of the
gene for a novel protein similar to
Drosophila L82 and predicted worm
proteins, ESTs, STSs, GSSs and two
putative CpG islands, complete
sequence.


562 


100 


340 


gi13540300


Mus musculus 


nucleolar protein C7B 


415 


66 


341 


gi14583268


Homo sapiens 


cytoplasmic protein mRNA, complete 
cds. 


628 


62 


341 


gi2104769 


Homo sapiens 


echinoderm microtubule-associated 
protein homolog HuEMAP mRNA, 
complete cds. 


560 


65 


341 


gi4406218 


Homo sapiens 


echinoderm microtubule-associated 
protein-like EMAP2 mRNA, complete 
cds. 


495 


59 


342 


AAB60099 


Homo sapiens 


Human transport protein TPFT-19. 


1616 


93 


342 


gi7294748 


Drosophila 
melanogaster 


CG7616 gene product 


580 


43 


342 


gi14714781


Mus musculus 


RIKEN cDNA 2610005A10 gene


441 


35 


343 


AAB94374 


Homo sapiens 


Human protein sequence SEQ ID 
NO:14915. 


3938 


99 


343 


gi10434690


Homo sapiens 


cDNA FLJ12921 fis, clone
NT2RP2004600. 


3938 


99 


343 


gi5689736 


Homo sapiens 


mRNA for myopodin. 


883 


34 


344 


AAY72604 


Homo sapiens 


Human Electron Transfer Protein, 
ETRN-2. 


717 


100 


344 


gi10953950


Geochelone 
carbonaria 


alpha-D chain hemoglobin 


407 


54 


344 


gi4455876 


Cairina 
moschata 


alpha D-globin 


398 


53 


345 


AAY72604 


Homo sapiens 


Human Electron Transfer Protein, 
ETRN-2. 


668 


78 


345 


gi10953950


Geochelone
carbonaria


alpha-D chain hemoglobin


359


43


345 


gi4455876 


Cairina 
moschata 


alpha D-globin 


349 


41 


346 


gi8655669 


Homo sapiens 


mRNA; cDNA DKFZp547C176 (from 
clone DKFZp547C176). 


1053 


100 


346 


AAB42048 


Homo sapiens 


Human ORFX ORF1812 polypeptide 
sequence SEQ ID NO:3624. 


840 


100 


346 


gi7298852 


Drosophila 
melanogaster 


CG10068 gene product 


601 


40 


347 


gi15778899


Homo sapiens 


Similar to f-box only protein 17, clone 
MGC:11162 IMAGE:3841901, mRNA,
complete cds. 


1537 


99 


347 


gi9280060 


Macaca 
fascicularis 


unnamed protein product 


1435 


95 


347 


gi15214527


Homo sapiens 


Similar to f-box only protein 17, clone 
MGC:9379 IMAGE:3864760, mRNA, 
complete cds. 


857 


56 


348 


AAG64860 


Homo sapiens 


Heart muscle cell differentiation related 
protein SEQ ID NO:61.


1079 


90 


348 


AAB99931 


Homo sapiens 


Human MesP1 protein sequence SEQ
ID NO:61.


1079 


90 


348 


gi13623241


Homo sapiens 


Similar to mesoderm posterior 1, clone 
MGC:10676 IMAGE:3944350, mRNA,
complete cds. 


1079 


90 


349 


gi4235144 


Homo sapiens 


chromosome 19, BAC 39498 (CTT-B- 
26X23), complete sequence. 


387 


100 


349 


gi8163824 


Homo sapiens 


krueppel-like zinc finger protein HZF2
mRNA, complete cds.


290 


74 


349 


AAY39779 


Homo sapiens 


CBMACD04 protein sequence. 


286 


71 


350 


gi7673618 


Mus musculus 


ubiquitin specific protease


2016 


73 


350 


gi5689463 


Homo sapiens 


mRNA for KIAA1063 protein, partial 
cds. 


2000 


64 


350 


gi16198231


Drosophila 
melanogaster 


LD43147p 


1188 


46 


351 


gi13540193


Homo sapiens 


isopentenyl pyrophosphate isomerase 1 
(IDI1), HT009-like protein, and 
isopentenyl pyrophosphate isomerase 
type 2 (IDI2) genes, complete cds.


1202 


100 


351 


gi13925766


Homo sapiens 


isopentenyl diphosphate dimethylallyl
diphosphate isomerase 2 (IDI2) gene, 
exon 4 and complete cds. 


1202 


100


351 


gi13925769


Homo sapiens 


isopentenyl diphosphate dimethylallyl 
diphosphate isomerase 2 (IDI2) 
mRNA, complete cds. 


1202 


100 


352 


gi13561001


Homo sapiens 


Human DNA sequence from clone 
RP11-528A10 on chromosome 6
Contains an IMPDH1 (IMP (inosine 
monophosphate) dehydrogenase 1) 
pseudogene, an RNA helicase 
pseudogene, a novel gene similar to 
KIAA0161, ESTs, STSs and GSSs, 
complete sequence. 


950 


100 


352 


gi13991706


Mus musculus 


UbcM4-interacting protein 4 


655 


53


352 


gi1136384


Homo sapiens


Human mRNA for KIAA0161 gene,
complete cds. 


651 




353 


gi13561001


Homo sapiens


Human DNA sequence from clone
RP11-528A10 on chromosome 6
Contains an IMPDH1 (IMP (inosine
monophosphate) dehydrogenase 1)
pseudogene, an RNA helicase
pseudogene, a novel gene similar to
KIAA0161, ESTs, STSs and GSSs,
complete sequence.


709 


7Q 


353 


gi13991706


Mus musculus 


UbcM4-interacting protein 4 


506 


45 


353 


gi1136384


Homo sapiens


Human mRNA for KIAA0161 gene,
complete cds. 




44 


354 


AAB74446 


Homo sapiens 


Human protease-inhibitor like protein. 


2759 


100 


354 


gi12053227


Homo sapiens 


mRNA; cDNA DKFZp434B044 (from
clone DKFZp434B044); complete cds.


2756 


99 


354 


gi15593902


Homo sapiens 


unnamed protein product 


2743 


99 


355 


AAB94358 


Homo sapiens


Human protein sequence SEQ ID
NO:14883. 


17851 
1 /Bo 


Oft 


355 


gi10434632


Homo sapiens


cDNA FLJ12886 fis, clone
NT2RP2004041, weakly similar to
SYNAPSINS IA AND IB.


1788


98


355 


gi12052738


Homo sapiens 


mRNA; cDNA DKFZp564H1322
(from clone DKFZp564H1322);
complete cds.


1788 


98 


356 


gi13436437


Homo sapiens 


Similar to RIKEN cDNA 5730438N18
gene, clone MGC:4399 
IMAGE:2905957, mRNA, complete 
cds. 


1634 


99


356 


gi15030091


Mus musculus 


Similar to RIKEN cDNA 5730438N18
gene 


1508 


91 


356 


AAB43372 


Homo sapiens 


Human ORFX ORF3136 polypeptide
sequence SEQ ID NO:6272. 


1464 


91 


357 


AAB73511 


Homo sapiens 


Human transferase HTFS-18, SEQ ID
NO:18. 


1880 


99 


357 


AAG74560 


Homo sapiens 


Human colon cancer antigen protein 
SEQ ID NO:5324. 


450 


98 


357 


AAG02792 


Homo sapiens 


Human secreted protein, SEQ ID NO: 
6873. 


324 


96 


358 


gi7673618 


Mus musculus 


ubiquitin specific protease 


2711 


95 


358 


gi5689463 


Homo sapiens 


mRNA for KIAA1063 protein, partial 
cds. 


2382 


78 


358 


gi5823525 


Drosophila 
melanogaster 


ubiquitin-specific protease nonstop 


1305 


49 


359 


AAB94775 


Homo sapiens 


Human protein sequence SEQ ID
NO:15864.


1022 


100 


359 


gi10435984


Homo sapiens 


cDNA FLJ13842 fis, clone
THYRO1000793. 


1022 


100 


359 


gi2340162 


Xenopus 
laevis 


dsRBP-ZFa 


380 


44 


360 


gi3676086 


bacteriophage 
PS119 


gpl9 


291 


59 


360 


gi1778468


Escherichia
coli


hypothetical protein


287


59


360 


gi1786768


Escherichia 
coli K12


bacteriophage lambda lysozyme 
homolog 


287 


59 


361 


gi13544003


Homo sapiens 


clone IMAGE:3677165, mRNA, 
partial cds. 


2172 


88 


361 


gi3169073


Schizosaccharomyces
pombe


phenylalanyl-tRNA synthetase,
mitochondrial precursor 


233 


33 


361 


gi13877969


Arabidopsis 
thaliana 


putative phenylalanine-tRNA 
synthetase 


228 


35 


362 


gi293694 


Mus musculus 


laminin receptor 


370 


49 


362 


gi13277921


Mus musculus 


laminin receptor 1 (67kD, ribosomal
protein SA) 


367 


49 


362 


gi4633839 


Mus musculus 


37kDa oncofetal antigen 


367 


49 


363 


gi15082271


Homo sapiens 


testes development-related NYD-SP21 
mRNA, complete cds. 


1876 


100 


363 


gi6807923 


Homo sapiens 


mRNA; cDNA DKFZp434H092 (from 
clone DKFZp434H092); partial cds. 


1620 


100 


363 


gi7294427 


Drosophila 
melanogaster 


CG8797 gene product 


118 


21 


364 


AAE01355 


Homo sapiens 


Human gene 4 encoded secreted 
protein HRABV43, SEQ ID NO:77. 


2724 


97 


364 


gi12836042


Mus musculus 


putative 


2607 


93 


364 


AAE01380 


Homo sapiens 


Human gene 4 encoded secreted 
protein HRABV43, SEQ ID NO:102. 


2500 


97 


365 


gi10439688


Homo sapiens 


cDNA: FLJ23109 fis, clone
LNG07754. 


2809 


99 


365 


gi9622093 


Mus musculus 


E-cadherin binding protein E7 


2768 


97 


365 


AAG01765 


Homo sapiens 


Human secreted protein, SEQ ID NO: 
5846. 


737 


99 


366


gi12854995


Mus musculus 


putative 


844 


71 


366 


gi10241691


Homo sapiens 


Novel human gene mapping to 
chromosome 22.


791 


99 


366 


gi14602790


Homo sapiens 


DKFZP566F0546 protein, clone 
MGC:2444 IMAGE:2822570, mRNA, 
complete cds. 


791 


99 


367 


gi15082283


Homo sapiens 


Similar to small glutamine-rich 
tetratricopeptide repeat (TPR)- 
containing, clone MGC:10496
IMAGE:3625993, mRNA, complete
cds. 


720 


100 


367 


gi3377591 


Homo sapiens 


full length insert cDNA YN88E09. 


592 


100 


367 


gi15488015


Homo sapiens 


TPR-containing co-chaperone mRNA, 
complete cds. 


450 


64 


368 


gi9104819 


Xylella 
fastidiosa 9a5c 


hypothetical protein 


151 


43 


368 


AAY59981 


Homo sapiens 


Human endometrium tumour EST 
encoded protein 41. 


128 


46 


368 


AAE03351 


Homo sapiens 


Human gene 4 encoded secreted 
protein fragment, SEQ ID NO: 126. 


121 


58 


369 


gi5817053 


Homo sapiens 


mRNA; cDNA DKFZp586D0824 
(from clone DKFZp586D0824); partial 
cds. 


571 


43 


369 


gi15530285


Homo sapiens 


clone MGC:24275 IMAGE:3950542,
mRNA, complete cds.


571


43


369 


gi13569476


Mus musculus


immunity-associated nucleotide 4


540 


42 


370 


gi8453103 


Homo sapiens 


zinc finger protein mRNA, complete 
cds. 


1296 


58 


370 


gi15012179


Homo sapiens


zinc finger protein 16 (KOX 9), clone
MGC:15145 IMAGE:3949487, mRNA,
complete cds.


1296 


58


370 


gi498721 


Homo sapiens


H.sapiens HZF10 mRNA for zinc
finger protein


1279 




371 


gi15929964


Homo sapiens


Similar to hypothetical protein
FLJ10702, clone MGC:21954
IMAGE:4391821, mRNA, complete 
cds. 




100


371 


AAB42336 


Homo sapiens 


Human ORFX ORF2100 polypeptide
sequence SEQ ID NO:4200. 


932 


93 


371 


AAB93080 


Homo sapiens 


Human protein sequence SEQ ID 
NO:11912.


923 


91 


372 


gi7328451 


Mus musculus 


sialidase 


893 


44 


372 


AAB93971 


Homo sapiens 


Human protein sequence SEQ ID 
NO:14038. 


866 


42 


372 


AAW73964 


Homo sapiens 


Human sialidase protein sequence. 


866 


42 


373 


gi1480005


Mus musculus 


Zic4 protein 


1490 


86 


373 


AAB14349


Homo sapiens 


Human Zic1 protein.


1102 


67 


373 


gi1208429


Homo sapiens 


mRNA for Zic protein, complete cds.


1102 


67 


374 


gi12860114


Mus musculus 


putative 


876 


40 


374 


gi161958


Trypanosoma 
cruzi 


surface antigen 


177 


23 


374 


gi1334643


Xenopus
laevis


APEG precursor protein




zu 


375 


AAY99349 


Homo sapiens 


Human PRO1110 (UNQ553) amino
acid sequence SEQ ID NO:31.


1683 


100 


375 


AAB19729 


Homo sapiens 


Human SECX Clone 4339264-2
encoded protein.


1683 


100 


375 


AAB15549 


Homo sapiens 


Human immune system molecule from 
Incyte clone 2774913. 


1683 


100 


376 


gi12746394


Homo sapiens 


CUG-BP and ETR-3 like factor 4 
(CELF4) mRNA, complete cds.


936 


100 


376 


gi13278792


Homo sapiens 


Bruno (Drosophila) -like 4, RNA
binding protein, clone MGC:2693 
IMAGE:2820541, mRNA, complete
cds. 


911 


98 


376 


gi12804985


Homo sapiens 


Similar to etrl, clone MGC:4320 
IMAGE:2820541, mRNA, complete 
cds. 


911 


98 


377 


gi12746394


Homo sapiens 


CUG-BP and ETR-3 like factor 4 
(CELF4) mRNA, complete cds. 


905 


89 


377 


gi13278792


Homo sapiens 


Bruno (Drosophila) -like 4, RNA
binding protein, clone MGC:2693 
IMAGE:2820541, mRNA, complete 
cds. 


880 


88 


377 


gi12804985


Homo sapiens 


Similar to etrl, clone MGC:4320 
IMAGE:2820541, mRNA, complete
cds. 


880 


88


378


gliZO'tlUOU 


Mus musculus


putative


fine 


ID 


378 


gi7293285 


Drosophila 
melanogaster


CG4768 gene product 


239 


37 


378 


gi1938566


Caenorhabditis 
elegans 


Hypothetical protein C48B6.3 


123 


38 


379 


gi3880385 


Caenorhabditis 
elegans 


predicted using Genefinder~contains
similarity to Pfam domain: PF01484
(Nematode cuticle collagen N-terminal 
domain), ocore—M.j, b-value— o.le-l.z, 
in— iMiLijNA Doi yjcy^aH.D comes trom 
mis gene~ci-/iN y\ no i yiC7*fra*t.j comes 
from this gene-cDNA EST yk68dl .5 

vk68dl 3 comes from this pene 


79 


35 


379 


gi6684 


Caenorhabditis 
elegans 


unnamed protein product


79 


35 


379 


gi156262


Caenorhabditis 
elegans 


collagen 


79 


35 


380 


AAB85365 


Homo sapiens


Novel Von Willebrand/thrombospondin-
like mature protein sequence.


657 


94 


380 


AAB85364 


Homo sapiens 


Novel Von Willebrand/thrombospondin-
like polypeptide.


657 


94


380 


gi12836633


Mus musculus


putative 


651 


59 


381 


gi15024264


Mus musculus


ribosomal protein L35a 


191 


53 


381 


gi57119 


Rattus 
norvegicus 


ribosomal protein L35a (aa 1-110)


191 


53 


381 


gi12846322


Mus musculus


putative 


191 


53 


382 


gi12835133


Mus musculus


putative 


617 


71 


382 


gi7293113 


Drosophila 
melanogaster


CG12379 gene product 


283 


72 


382 


gi6042159 


Caenorhabditis 
elegans 


Hypothetical protein F53A3.7 


226 


55 


383 


AAB81053 


Homo sapiens 


Human protein HP01640 amino acid
sequence. 


932 


100


383 


gi12841896


Mus musculus


putative 


925 


98 


383 


gi7303144 


Drosophila 
melanogaster 


CG10153 gene product


612 


65 


384 


gi10440373


Homo sapiens 


mRNA for FLJ00022 protein, partial
cds. 


1345 


93 


384 


gi10440396


Homo sapiens 


mRNA for FLJ00031 protein, partial
cds. 


647 


88 


384 


gi1086626


Caenorhabditis 
elegans 


Hypothetical protein C06A6.3


273 


33 


385 


gi12053305


Homo sapiens 


mRNA; cDNA DKFZp434G099 (from
clone DKFZp434G099); complete cds. 


1210 


100 


385 


gi2516239


Mus musculus


Rab33B 


1138 


94 


385 


gi12836564


Mus musculus


putative 


1138 


94 


386 


gi7243247 


Homo sapiens 


mRNA for KIAA1433 protein, partial 
cds. 


3232 


100 


386 


AAB94053 


Homo sapiens 


Human protein sequence SEQ ID 
NO:14222. 


3223 


99 


386 


gi13096872


Mus musculus


Unknown (protein for MGC:7720)


2906 


89 


387 


gi14599491


Homo sapiens 


small proline-rich protein 2F (SPRR2F)
gene, complete cds.


458


100


387 


gi14599489


Homo sapiens 


small proline-rich protein 2E 
(SPRR2E) gene, complete cds.


444 


95 


387 


gi338423 


Homo sapiens 


Human small proline rich protein
(sprII) mRNA, clone 930.


434 


94 


388


oitf) 1 fifiQQ 
givn/ luoyy 


IVaLCllS 


r-oox protein riiL^ 


1449 


99 


388 


gi14043139


Homo sapiens 


RIKEN cDNA 2610511F20 gene,
clone MGC:15482 IMAGE:2987858, 
mRNA, complete cds. 


1383 


100 


388 


gi12848653


Mus musculus 


putative 


1371 


99 


389 


gi2853265 


Rattus 
norvegicus 


jun dimerization protein 2 


800 


96 




gi12248392


Mus musculus 


transcriptional inhibitory factor 


795 


95 


389 


gi6648146 


Homo sapiens 


chromosome 14 clone CTD-2317F5 
map 14q24.3, complete sequence. 


481 


100 


390 


gi15277240


Homo sapiens 


genomic DNA, chromosome 6p21.3, 
HLA Class I region, section 17/20. 


1296 


100 


390 


gi11875405


Homo sapiens 


HZFwl protein mRNA, complete cds. 


1291 


99 


390 


gi11875407


Homo sapiens 


HZFw2 protein mRNA, complete cds. 


773 


99 


391 


gi6572201 


Homo sapiens 


Human DNA sequence from clone 
CITF22-27C3 on chromosome 
22q13.1-13.31 Contains a gene for a
novel protein (DJ1163J1.2) and part of
a gene for a novel protein (DJ1163J1.3,
similar to mouse B99), ESTs, STSs and 
GSSs, complete sequence. 


863 


100 


391 


gi4469186 


Homo sapiens 


Human DNA sequence from clone 
RP5-1163J1 on chromosome 22q13.2-
13.33 Contains the 3' part of a gene for
a novel KIAA0279 LIKE EGF-like
domain containing protein (similar to
mouse Celsr1, rat MEGF2), a novel
gene for a protein similar to C. elegans
B0035.16 and bacterial tRNA (5-
Methylaminomethyl-2-thiouridylate)-
Methyltransferases, and the 3' part of a
novel gene for a protein similar to
mouse B99. Contains ESTs, GSSs and
putative CpG islands, complete
sequence.


863 


100 


391 


AAB92551 


Homo sapiens 


Human protein sequence SEQ ID 
NO:10735. 


862 


96 


392 


gi5001720 


Mus musculus 


odd-skipped related 1 protein


1413 


97 


392 


gi15778246


Mus musculus 


odd-skipped related 2 


924 


66 


392 


gi15488723


Mus musculus 


Unknown (protein for MGC:19171)


924 


66 


393 


AAB94364 


Homo sapiens 


Human protein sequence SEQ ID 
NO:14895.


2700 


99 


393 


gi10434650


Homo sapiens 


cDNA FLJ12895 fis, clone
NT2RP2004187, weakly similar to 
ZINC FINGER PROTEIN 38. 


2700 


99 


393 


gi13623217


Homo sapiens 


Similar to hypothetical protein
FLJ12895, clone IMAGE:3533093,
mRNA, partial cds.


2150


99


394 


gi12053105


Homo sapiens 


mRNA; cDNA DKFZp434K111 (from
clone DKFZp434K111); complete cds.


3116 


100 


394 


gi2282582 


Mus musculus


actin-binding protein 


2402 


74 


394 


AAR94386 


Homo sapiens 


Human neural cell protein marker 
RR/B. 


2400 


74 


395 


gi207145 


Rattus 
norvegicus 


synaptotagmin II 


2128 


95 


395 


gi7739733 


Mus musculus 


synaptotagmin II 


2121 


95 


395 


gi688412 


Mus musculus 


synaptotagmin II/IP4BP


2121 


95 


396 


gi15487674


Homo sapiens 


OSBP-related protein 1 mRNA, 
complete cds. 


3220 


99 


396 


AAB92611 


Homo sapiens 


Human protein sequence SEQ ID 
NO:10880.


703 


100 


396 


AAY97291 


Homo sapiens 


Lipid associated protein (LIPAP) 
2764333CD1. 


703 


100 


397 


gi11231085


Macaca 
fascicularis 


hypothetical protein 


490 


76


397 


gi2447128 


Paramecium 
bursaria
Chlorella virus 
1 


contains 10 ankyrin-like repeats; 
similar to human ankyrin, corresponds 
to Swiss-Prot Accession Number 
P16157 


212 


33 


397 


gi6634025 


Homo sapiens 


mRNA for KIAA0379 protein, partial 
cds. 


203 


38 


398 


AAB21047 


Homo sapiens 


Human nucleic acid-binding protein, 
NuABP-51. 


1082 


100 


398 


gi833629 


Xenopus 
laevis 


nucleoplasmin 


459 


49 


398


gi64940 


Xenopus 
laevis 


nucleoplasmin (AA 1-200)


435 


46 


399 


gi15919272


Homo sapiens 


putative forkhead/winged-helix 
transcription factor (FOXP2) mRNA, 
complete cds. 


596 


84 


399 


gi2565057 


Homo sapiens 


CAGH44 mRNA, partial cds. 


596 


84 


399 


gi14582802


Mus musculus 


forkhead-related transcription factor 2 


588 


82 


400 


AAB08199 


Homo sapiens 


Amino acid sequence of human 
diacylglycerol kinase beta 
(DAGKbeta). 


4217 


99 


400 


gi10279722


Homo sapiens 


unnamed protein product 


4217 


99 


400 


gi485398 


Rattus 
norvegicus 


90kDa-diacylglycerol kinase 


4046 


95 


401 


gi7670446 


Mus musculus 


unnamed protein product 


1295 


87 


401 


gi13185203


Homo sapiens 


unnamed protein product


799 


83 


401 


AAY31642 


Homo sapiens 


Human transport-associated protein-4 
(TRANP-4).


466 


35 


402 


gi12837990


Mus musculus 


putative 


985 


69 


402 


gi5668737 


Mus musculus 


UBE-lc2 


661


50 


402 


AAB94645 


Homo sapiens 


Human protein sequence SEQ ID 
NO:15538. 


426 


52 


403 


gi10439821


Homo sapiens 


cDNA: FLJ23209 fis, clone
ADSH00512. 


2596 


99 


403 


gi10440353


Homo sapiens 


mRNA for FLJ00011 protein, partial
cds.


1448


97


403 


gi8217420 


Homo sapiens 


Human DNA sequence from clone 
RP11-108L7 on chromosome 10.
Contains part of the gene for a novel
Insulin-like growth factor binding type
protein with Kazal-type serine protease
inhibitor domain, the gene for a novel
protein similar to rat tricarboxylate
carrier, the gene for a novel PDZ
(DHR, GLGF) domain protein, the
gene for a novel protein similar to
KIAA[illegible], KIAA[illegible] and Fugu
hypothetical protein 2, the gene for a
novel protein similar to Plasmodium
POM1 and C. elegans F46G11.1, a
putative novel gene, the SEMA4G gene
for semaphorin 4G and a novel gene.
Contains ESTs, STSs, GSSs and seven
putative CpG islands, complete
sequence.


1026 


100 


404 


AAB42219 


Homo sapiens 


Human ORFX ORF1983 polypeptide 
sequence SEQ ID NO:3966. 


2230 


96 


404 


gi3417297


Homo sapiens 


Human Chromosome 16 BAC clone 
CIT987SK-A-635H12, complete
sequence. 


2250 




404 


gi15559282


Homo sapiens 


clone MGC:20208 IMAGE:3936339, 
mRNA, complete cds. 


1021 


53 


405 


gi13365905


Macaca 
fascicularis 


hypothetical protein 


1154 


99 


405 


AAB15537 


Homo sapiens 


Human immune system molecule from 
Incyte clone 2751129.


911 


100 


405 


AAE04891 


Homo sapiens 


Human transporter and ion channel-4 
(TRICH-4) protein. 


360 


39 


406 


gi262843 


Rattus sp.


neurotransmitter transporter 


3709 


96 


406 


gi545078 


Rattus sp.


Na+/Cl(-)-dependent neurotransmitter 
transporter 


3694 


96 


406 


AAR88390 


Homo sapiens 


Human neurotransmitter transporter 
protein. 


3668 


96 


407 


AAB31212 


Homo sapiens 


Amino acid sequence of human 
polypeptide PRO6004. 


728 


100 


407 


AAB44331 


Homo sapiens 


Human PRO4993 protein sequence
SEQ ID NO:612.


717 


100 


407 


gi4519558 


Rattus
norvegicus 


Kilon 


667 


94 


408 


gi15277972


Mus musculus


Similar to DnaJ (Hsp40) homolog, 
subfamily B, member 1 


808 


49 


408 


gi7804472 


Mus musculus 


heat shock protein 40


808 


49 


408 


AAB72675 


Homo sapiens 


Human HP Jl. 


804 


48 


409 


gi12841015


Mus musculus 


putative 


798 


52 


409 


AAB60114 


Homo sapiens 


Human transport protein TPPT-34. 


787 


51 


409 


gi13435410


Mus musculus 


Similar to RIKEN cDNA 1810012H11
gene 


768 


53 


410 


gi488555 


Homo sapiens


Human zinc finger protein ZNF135
mRNA, complete cds.


1241


52



SEQ ID NO:


Accession No.


Species


Description


Score


%
Identity






410 


AAY73346 


Homo sapiens 


HTRM clone 619699 protein sequence. 


1238 


49 


410 


AAB43912 


Homo sapiens 


Human cancer associated protein 
sequence SEQ ID NO:1357.


1231 


49 


411 


gi837292 


Rattus 
norvegicus 


S100A1 gene product 


278 


59 


411 


AAB45531 


Homo sapiens 


Human S100A1 protein. 


274 


57 


411 


gi11228039


Homo sapiens 


S100A1 cDNA 


274 


57 


412 


AAB19851 


Homo sapiens 


Human muscle-specific protein Ozz. 


1504 


100 


412 


gi13929456


Homo sapiens 


Human DNA sequence from clone 
RP3-337O18 on chromosome 20q12-
13.1. Contains the PLPT gene encoding
Phospholipid Transfer Protein, the
PPGB gene coding for Lysosomal
Protective Protein precursor (EC
3.4.16.5, Cathepsin A,
Carboxypeptidase C) and the gene
encoding peroxisomal acyl-CoA
thioesterase (PTE1, thioesterase II),
four novel genes, the gene for a novel
protein similar to Drosophila
Neuralized (Neu) and the 5' end of an
isoform of the TNNC2 gene for fast
troponin C2. Contains three CpG
islands, ESTs, STSs and GSSs,
complete sequence.


1504 


100 


412 


gi12835750


Mus musculus 


putative 


1328 


89 


413 


gi12847182


Mus musculus 


putative 


875 


87 


413 


gi4884173




mRNA; cDNA DKFZp564[illegible]
(from clone DKFZp564[illegible]); partial
cds.


O40 


100 


413 


gi10047333


Homo sapiens


mRNA for KIAA16[illegible] protein, partial

cds. 




HZ. 


414 


gi7959343 


Homo sapiens 


mRNA for KIAA1538 protein, partial 
cds. 


3286 


100 


414 


AAB42721 


Homo sapiens 


Human ORFX ORF2485 polypeptide 
sequence SEQ ID NO:4970.


382 


100 


414 


AAB42764 


Homo sapiens


Human ORFX ORF2528 polypeptide
sequence SEQ ID NO:5056.


JJJ 


HI 


415 


gi14043332


Homo sapiens 


Similar to ring finger protein 23, clone 
MGC:2475 IMAGE:3051389, mRNA,
complete cds. 


1006 


43 


415 


gi10716078


Mus musculus 


testis-abundant finger protein 


995 


42 


415 


gi10716076


Homo sapiens 


mRNA for testis-abundant finger 
protein, complete cds. 


966 


40 


416 


gi3599509 


Mus musculus 


rho/rac-interacting citron kinase 


1507 


61 


416 


gi3360512 


Rattus 
norvegicus 


Citron-K kinase 


1505 


89 


416 


gi3599507 


Mus musculus 


rho/rac-interacting citron kinase short 
isoform


1503 


89 


417 


gi2358070 


Mus musculus 


trypsinogen 1 


898 


65 


417 


gi603903 


Gallus gallus 


trypsinogen 


408 


36 


417 


gi65163 


Xenopus
laevis


trypsin precursor


405


38


418 




Rattus
norvegicus


cerebroglycan


1 1 JZ 


o / 






Homo sapiens


Human PRO[illegible] (UNQ[illegible]) protein
sequence SEQ ID NO: 109. 


j/U 


HO 


418




Homo sapiens


Human [illegible] protein.


J /u 


Afi 
40 


419 


AAM06489 


Homo sapiens 


Human foetal protein, SEQ ID NO: 
220. 


376 


82 


419


gllZOOJj /O 


Mus musculus 


putative 


230 


31 


419




Homo sapiens 


Human four disulfide core domain 
(FDCD)-containing protein.


222 


31 


420


AAB42561


Homo sapiens 


Human ORFX ORF2325 polypeptide
sequence SEQ ID NO:4650. 


5075 


100 




crZCA 1 QQ« 


Homo sapiens 


mRNA; cDNA DKFZp434N074 (from 
clone DKFZp434N074). 


5070 


99 




gi4589532


Homo sapiens 


mRNA for KIAA0944 protein, partial
cds.


3375 


61 


421 


gi10438804


Homo sapiens 


cDNA: FLJ22419 fis, clone

rlKlJUojyj. 


1026 


60 


421 


gi13938187


Homo sapiens 


hypothetical protein FLJ22419, clone 
MGC:[illegible] IMAGE:[illegible], mRNA,
complete cds.


1026 


60 


421 


gi6690339 


Mus musculus 


hematopoietic zinc finger protein


717 


47 


422 




Homo sapiens


Human protein sequence SEQ ID
NO:15739. 


10 to 


99


422 


gi10435784


Homo sapiens


cDNA FLJ13[illegible] fis, clone
PLACE2000111. 


iO/o 


99


422 


gi5706454 


Homo sapiens 


mRNA for Natural killer cell p44 
related gene 2 (NKp44RG2). 


158 


29 


423 


gi15026974


Homo sapiens 


mRNA for obscurin (OBSCN gene). 


2713 


96 


423 


AAB95162 


Homo sapiens 


Human protein sequence SEQ ID 
NO: 17205. 


1173 


86 




giioyjoi /u 


Homo sapiens 


clone IMAGE:2961284, mRNA, 
partial cds.


540 


26 


A? A 


gllZoOl 


Mus musculus 


putative 


523 


51 


424 


AAE02058 


Homo sapiens 


Human four disulfide core domain 
(FDCD)-containing protein.


485 


38 


424 


gi12655452


Homo sapiens 


mRNA for keratin associated protein 
4.7 (KRTAP4.7 gene). 


485 


40 


425 


gi12830335


Homo sapiens 


Human DNA sequence from clone
RP11-550O8 on chromosome 20.
Contains a novel gene encoding a 
protein kinase, an RPL7 (60S 
Ribosomal Protein L7) pseudogene, a 
CpG island, ESTs, STSs and GSSs, 
complete sequence. 


xuoz 


QQ 

yy 


425 


AAB65688 


Homo sapiens 


Novel protein kinase, SEQ ID NO: 216. 


1732 


100 


425 


AAB65690 


Homo sapiens 


Novel protein kinase, SEQ ID NO: 218. 


1184 


69 


426 


gi388518 


Homo sapiens 


Human V beta 5.5 mRNA for a new T 
cell receptor. 


627 


95 


426 


gi36173 


Homo sapiens 


H.sapiens rearranged T-cell receptor 
beta chain mRNA. 


613 


94 


426 


gi1552509


Homo sapiens


Human germline T-cell receptor beta 


606 


100 






WO 02/081731 



PCT/US02/01222 



Table 2 



SEQ ID NO: 


Accession No. 


Species 


Description 


Score 


% 
Identity 








chain TCRBV13S1, TCRBV6S8A2T, 
TCRBV5S6A3N2T, TCRBV13S6A2T,
TCRBV6S9P, TCRBV5S3A2T, 
TCRBV13S8P, TCRBV6S3A1N1T, 
TCRBV5S2, TCRBV6S6A2T, 
TCRBV5S7P, TCRBV13S4,
TCRBV6S2A1N1T, TCRBV5S4A2T, 
TCRBV6S4A1, TCRBV23S1A2T, 
1CKBV12MA1JN2, 1CKBV21S2A2, 
TCRBV8S1, TCRBV8S2A1T, 
TCRBV8S3, TCRBV16S1A1N1,
TCRBV24S1A3T, TCRBV25S1A2PT,
TCRBV26S1P, TCRBV18S1, 

TCRBV10S1P genes from bases 
257519 to 472940 (section 2 of 3). 






497 


A APA4759 


Homo sapiens


Human beta-1,3-galactosyltransferase
homologue, ZNSSP8. 


A1A 

434 


33 


497 




Homo sapiens


unnamed protein product 


A1A 

434 


33 


427 


gi14039836


Homo sapiens 


beta 1,3 N-acetylglucosaminyltransferase Lc3
synthase mRNA, complete cds.


434 


33 


49R 


01^0^149 

glJJfOlHZ 


Homo sapiens


Human proteasome subunit LMP7
ya.LL%ziv jjivj-r /v^j iiuvinxt., complete cos. 


4Z1Q 


ACS 

4y 


428 


gi38482 


Homo sapiens 


H.sapiens gene for major
histocompatibility complex encoded
proteasome subunit LMP7.


624 


49 


428 


gi1054747


Homo sapiens


H.sapiens DMA, DMB, HLA-Z1, IPP2,
LMP2, TAP1, LMP7, TAP2, DOB,
DQB2 and RING8, 9, 13 and 14 genes.


694 


49


429 


AAG71415 


Homo sapiens


Human olfactory receptor polypeptide,
SEQ ID NO: 1096.


1 JO 1 


100


429 


AAG71594 


Homo sapiens 


Human olfactory receptor polypeptide, 
SEQ ID NO: 1275. 


1344 


83 


429 


AAG72476 


Homo sapiens 


Human OR-like polypeptide query 
sequence, SEQ ID NO: 2157. 


1011 


100 


430 


gi10440063


Homo sapiens 


cDNA: FLJ23392 fis, clone HEP17418.


3045 


100 


430 


gi15214571


Mus musculus


Unknown (protein for
IMAGE:4207025)




ou 


430 


gi1770528


Homo sapiens 


H. sapiens mRNA for translin 
associated zinc finger protein-1.


687 


38 


431 


gi12859929


Mus musculus 


putative 


917 


96 


431 


gi15207935


Macaca 
fascicularis


hypothetical protein 


301 


96 


431 


gi1655637


Mus musculus 


orf 


147 


27 


432 


gi4585414 


Bacteriophage 
933W 


hypothetical protein 


408 


42 


432 


gi4499798 


Bacteriophage 
933W 


orfl5; homologous to ninG gene 


408 


42 


432 


gi5881629 


Bacteriophage 
VT2-Sa 


hypothetical protein 


408 


42 


433 


gi13161184


Homo sapiens 


cytochrome P450 2S1 (CYP2S1) 
mRNA, complete cds. 


2615 


100 
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Score 


% 
Identity 


433 


AAB93056 


Homo sapiens 


Human protein sequence SEQ ID 
NO:11860. 


2527 


100 


433 


gi14042396


Homo sapiens 


cDNA FLJ14699 fis, clone 
NT2RP2006571, moderately similar to 
CYTOCHROME P450 2G1 (EC 
1.14.14.1).


2527 


100 


434 


gi13445575


Homo sapiens 


facilitative glucose transporter 
GLUT10 (SLC2A10) mRNA, complete
cds. 


2752 


99 


434 


gi13603727


Homo sapiens 


glucose transporter (GLUT10) mRNA,
complete cds. 


2752 


99 


434 


gi11065680


Homo sapiens 


Novel human gene mapping to 
chromosome 20, similar to membrane 
transporters. 


2752 


99 


435 


gi13310486


Homo sapiens 


C2H2 zinc finger protein (SALL3) 
gene, complete cds. 


6094 


99 


435 


gi6688241 


Homo sapiens 


SALL3 gene, exons 1a, 2 and 3.


6070 


99 


435 


gi1296845


Mus musculus 


spalt protein 


5089 


84 


436


AAG71445 


Homo sapiens 


Human olfactory receptor polypeptide, 
SEQ ID NO: 1126. 


1312 


85 


436 


AAG71447 


Homo sapiens 


Human olfactory receptor polypeptide, 
SEQ ID NO: 1128. 


924 


61 


436 


gi15293797


Homo sapiens 


clone OR6M1 olfactory receptor gene, 
partial cds. 


829 


78 


437 


AAB65297 


Homo sapiens 


Human PRO9828 protein sequence
SEQ ID NO:511.


1360 


100 


437 


AAG89178 


Homo sapiens 


Human secreted protein, SEQ ID NO: 
298. 


1360 


100 


437 


AAB84652 


Homo sapiens 


Amino acid sequence of fibroblast
growth factor homologue zFGF12. 


1360 


100 


438 


gi53756 


Mus musculus 


minopontin precursor ( AA -66 to 272) 


1521 


100 


438 


gi297546 


Mus musculus 


osteopontin 


1516 


99 


438 


gi50864 


Mus musculus 


T lymphocyte activation protein 


1514 


99 
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NO: 


Database 
entry ID 


Description 


*Results


1 


PF00204 


Zinc finger C-x8-C-x5-C-x3-H type 
(and similar). 


PF00204 11.59 9.700e-12 426-437 


1 


BL00518 


Zinc finger, C3HC4 type (RING 
finger), proteins. 


BL00518 12.23 3.667e-09 33-42 


2 


BL00291


Prion protein. 


tjt AAim A A ACi O Tfn a Aft IOC 

BL00291 A 4.49 o.759e-U9 18S-22U 


3 


PF01105 


emp24/gp25L/p24 family. 


PF01105B 25.12 1.000e-40 178-230 


4 


BL00307 


Legume lectins beta-chain proteins. 


BL00307G 9.91 8.531e-10 678-689


4 


PF00922 


Vesiculovirus phosphoprotein. 


PF00922A 19.17 8.862e-09 281-315 


6 


BL01159 


WW/rsp5/WWP domain proteins. 


BL01159 13.85 6.073e-09 61-76 


6 


BL00591 


Glycosyl hydrolases family 10 proteins. 


BL00591G 9.65 9.167e-09 311-323 


7 


BL01159 


WW/rsp5/WWP domain proteins. 


BL01159 13.85 6.073e-09 61-76 


7 


BL00591 


Glycosyl hydrolases family 10 proteins. 


BL00591G 9.65 9.167e-09 311-323


9 


BL00913 


Iron-containing alcohol dehydrogenases 
proteins. 


BL00913D 24.20 8.981e-17 170-204 
BL00913C 7.62 4.375e-11 136-146
BL00913B 10.94 7.706e-11 86-102


10 


BL00913 


Iron-containing alcohol dehydrogenases 
proteins. 


BL00913D 24.20 8.981e-17 218-252 
BL00913C 7.62 4.375e-11 184-194
BL00913B 10.94 7.706e-11 134-150


11 


BL50062 


BCL2-like apoptosis inhibitors (spans 
part of BH3, BH1 and BH.


BL50062C 6.66 8.500e-11 349-358


14 


BL01144 


Ribosomal protein L31e proteins.


BL01144 25.07 9.069e-26 78-130


15 


PF00204 


Zinc finger C-x8-C-x5-C-x3-H type 
(and similar). 


PF00204 11.59 6.694e-10 355-366 


15 


BL00904 


Protein prenyltransferases alpha subunit 
repeat proteins proteins. 


BL00904A 8.30 4.000e-09 485-535 


15 


BL00415 


Synapsins proteins. 


BL00415N 4.29 6.727e-12 483-527 
BL00415N 4.29 2.774e-09 118-600 
BL00415P 2.37 4.290e-09 819-855 
BL00415Q 2.23 6.534e-09 474-510 


15 


PR00049 


WILM'S TUMOUR PROTEIN 
SIGNATURE 



PR00049D 0.00 4.500e-14 490-505 
PR00049D 0.00 2.500e-12 489-504 
PR00049D 0.00 4.000e-12 491-506 
PR00049D 0.00 8.201e-11 488-503
PR00049D 0.00 1.205e-10 492-507 
PR00049D 0.00 3.746e-09 487-502 

PR00049D 0.00 5.271e-09 485-500
PR00049D 0.00 6.644e-09 493-508 


15 


DM00215 


PROLINE-RICH PROTEIN 3.


DM00215 19.43 9.022e-13 471-504 
DM00215 19.43 1.458e-09 483-516 
DM00215 19.43 2.678e-09 469-502 
DM00215 19.43 5.424e-09 468-501 
DM00215 19.43 8.017e-09 470-503 
DM00215 19.43 9.085e-09 466-499
DM00215 19.43 9.237e-09 484-517 


15 


BL01113 


C1q domain proteins.


BL01113A 17.99 9.308e-09 116-143 


15 


BL00048 


Protamine P1 proteins.


BL00048 6.39 5.263e-10 196-223 BL00048
6.39 3.363e-09 262-289 BL00048 6.39 
9.112e-09 184-211 


17 


PR00773 


GRPE PROTEIN SIGNATURE 


PR00773D 16.14 5.922e-09 215-235 
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xcesuiis 


23 


PD00930 


PROTEIN GTPASE DOMAIN
ACTIVATION.


PD00930B 33.72 7.300e-26 600-203 
ruuuyjUA Zj.oz l .d i**e-io /-jzj 


23 


BL50002 


Src homology 3 (SH3) domain proteins
profile. 


BL50002A 14.19 4.000e-12 727-746 


23 


PF00182 


GTPase-activator protein for Rho-like 
GTPases 


OT7AA 1 OTD 1 yf *>A 7 ^15- 1 O CyfO IOC 

rrUUlo/B I4.ZU I.555Q-IZ 549-1Z8 


25 


BL00375 


UDP-glycosyltransferases proteins. 


BL00375F 16.99 7.061e-35 291-336 
BL00375C 18.27 2.615e-19 126-150 

"DT f\f\1KT\ A A <<C Q AAA a 1*7 ino OOA 

£>LUU3 /5U 14.DO y.uuue-i / iyz-zzu 
BL00375B 21.22 8.627e-16 67-108 

l5i-«UU-5 foKJ 1J.U1 / /e-13 J7U-4JU 


28 


dLVL 1 /U 


Ribosomal protein L6e proteins.


■RT AT 1 7A A 1 0 1A O 1 /11<*_AA 110 17^ 


28 


PD01457 


RIBOSOMAL PROTEIN 40S ZINC-
FINGER METAL.


PD01457A 16.51 9.845e-09 67-112 


29 


BL00359 


Ribosomal protein L11 proteins.


BL00359B 23.07 4.231e-24 56-97 
BL00359C 22.18 6.148e-22 111-145 

T5T nAI^OA OA AA A AAAa 01 OA 


29 


BL01108 


Ribosomal protein L24 proteins. 


BL01108A 20.33 1.000e-08 40-73


30 


PR00983 


CYSTEINYL-TRNA SYNTHETASE
SIGNATURE 


!>1>AAQQ1Tk 1/1 1 d Q OflOa 01 OOA OOO 

rKUuyojjj 14.10 j.zuye-zj z/u-z^z 
PR00983C 11.27 3.415e-21 239-258 
PR00983A 11.10 1.878e-12 75-87 


30 


BL00178 


Aminoacyl-transfer RNA synthetases 
class-I proteins. 


BL00178B 7.11 2.286e-09 314-325 


31 


PR00718


PHOSPHOLIPASE D SIGNATURE


PR007 1 8E o.Ol 1 .UUUe-Uo 3z 1 


32 


BL00518 


Zinc finger, C3HC4 type (RING 
finger), proteins. 


BL00518 12.23 6.133e-10 49-58


33 


PF00992 


Troponin. 


"DITAAOOI A 1 £ Cn H OH*} ex 1 a i a a<. tjtjaaogo a 
rrUUyyzA lo.o/ /.y/Ze-lU 1U-4D rrUU^yZA 

16.67 5.145e-09 17-52 PF00992A 16.67 

A AO. At* AO 5 A 01 


34 


BL01019


ADP-ribosylation factors family
proteins.


PTAiniQA 11 on 8 AAA^ 11 fA 1AR 
DUJiSJlyJx Ij.ZU o.UUUe-11 Oo-lUo 


34 


PR00449 


TRANSFORMING PROTEIN P21 
RAS SIGNATURE 


PR00449C 17.27 4.938e-20 75-98 
PR00449A 13.20 1.900e-15 34-56 
PR00449E 13.50 6.870e-15 173-196 
PR00449B 14.34 1.360e-10 57-74 

PP ftftAdOn \C\ 70 S "?K4f> (V) 1V7-1S1 

x i\.uw*Htpi_i iu. / y j.jo*tc-v? ijriji 


37 


PR00764 


COMPLEMENT C9 SIGNATURE 


PR00764F 16.89 7.783e-11 204-225


37


DM01077 


SEX HORMONE-BINDING 
GLOBULIN. 


DM01077A 16.30 1.165e-10 43-90


37 


BL00279 


Membrane attack complex components 
/perforin proteins. 


BL00279E 37.11 9.163e-09 187-235


38 


PR00832


PAXILLIN SIGNATURE


PR00832B 9.87 6.284e-10 768-792 


38 


PR00806 


VINCULIN SIGNATURE 


PR00806A 6.63 9.260e-09 766-777


38 


PR00049 


WILM'S TUMOUR PROTEIN 
SIGNATURE 


PR00049D 0.00 8.661e-15 766-781 
PR00049D 0.00 3.250e-12 764-779 
PR00049D 0.00 7.277e-11 765-780
PR00049D 0.00 8.786e-10 763-778 
PR00049D 0.00 9.390e-09 762-777 


40 


BL00226 


Intermediate filaments Proteins. 


BL00226D 19.10 3.172e-34 397-444
BL00226B 23.86 5.929e-23 230-278
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*Results 








BL00226C 13.23 4.808e-21 296-327 
BL00226A 12.77 5.065e-13 129-144 
BL00226B 23.86 6.400e-10 181-229 


41 


BL00243 


Integrins beta chain cysteine-rich 
domain proteins. 


BL00243I 31.77 2.014e-09 156-199
BL00243I 31.77 5.437e-09 159-202
BL00243I 31.77 5.690e-09 30-73


41 


BL01208 


VWFC domain proteins.


BL01208B 15.83 5.865e-09 184-199 


41 



BL00203 


Vertebrate metallothioneins proteins. 


BL00203 13.94 3.670e-11 66-112 BL00203
13.94 4.659e-11 40-86 BL00203 13.94
7.429e-11 70-116 BL00203 13.94 9.505e-11
140-186 BL00203 13.94 2.723e-10 21-67
BL00203 13.94 2.723e-10 61-107 BL00203
13.94 3.147e-10 105-151 BL00203 13.94 
4.064e-10 22-68 BL00203 13.94 5.213e-10 
161-207 BL00203 13.94 6.457e-10 26-72 
BL00203 13.94 7.032e-10 184-230 BL00203
13.94 7.223e-10 80-126 BL00203 13.94 
9.043e-10 130-176 BL00203 13.94 1.735e- 
09 175-221 BL00203 13.94 3.020e-09 150- 
196 BL00203 13.94 3.204e-09 65-111 
BL00203 13.94 3.296e-09 95-141 BL00203 
13.94 3.663e-09 135-181 BL00203 13.94 
5.041e-09 47-93 BL00203 13.94 5.041e-09 
85-131 BL00203 13.94 5.500e-09 100-146
BL00203 13.94 5.867e-09 126-172 BL00203
13.94 5.959e-09 90-136 BL00203 13.94 
6.694e-09 170-216 BL00203 13.94 6.878e- 
09 151-197 BL00203 13.94 6.969e-09 17-63 
BL00203 13.94 7.337e-09 115-161 BL00203 
13.94 7.429e-09 71-117 BL00203 13.94 
7.704e-09 171-217 BL00203 13.94 8.531e- 
09 155-201 BL00203 13.94 8.714e-09 165- 
211 BL00203 13.94 9.265e-09 116-162 


41 


BL00269 


Mammalian defensins proteins. 


BL00269C 16.52 9.289e-09 28-57 
BL00269C 16.52 9.289e-09 72-101 


41 


PD02283 


PROTEIN SPORULATION REPEAT 
PRECU. 


PD02283C 17.54 5.050e-09 138-166 
PD02283C 17.54 5.175e-09 24-52 
rDU/zoJC 17.34 5.1 Oe-Uy OO-VO 
PD02283C 17.54 6.738e-09 113-141
PD02283C 17.54 7.188e-09 163-191 
PD02283C 17.54 7.750e-09 173-201 
PD02283C 17.54 7.975e-09 128-156 
PD02283C 17.54 8.650e-09 148-176 
PD02283C 17.54 9.325e-09 118-146 


41 


BL00799 


Granulins proteins. 


BL00799D 12.41 7.661e-09 49-96 
BL00799G 9.41 1.000e-08 39-80 


43 


BL00291 


Prion protein. 


BL00291A 4.49 4.414e-09 47-82 


44 


PF00084 


Sushi domain proteins (SCR repeat 
proteins. 


PF00084B 9.45 7.188e-10 1549-1561 


44 


BL00142 


Neutral zinc metallopeptidases, zinc- 


BL00142 8.38 2.286e-09 730-741 
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binding region proteins. 




44 


PR00480 


ASTACIN FAMILY SIGNATURE


PR00480B 15.41 3.314e-09 725-744 


45 


BL00414 


Profilin proteins. 


BL00414D 15.59 9.182e-10 81-108 


48 


PR00837 


ALLERGEN V5/TPX-1 FAMILY 
SIGNATURE 


PR00837D 11.12 6.023e-09 22-36


48 


BL01009 


Extracellular proteins SCP/Tpx- 
1/Ag5/PR-1/Sc7 proteins. 


BL01009E 13.50 8.204e-09 21-37 


49 


BL00284 


Serpins proteins. 


BL00284A 15.64 2.350e-20 85-109 
BL00284D 16.34 4.240e-19 323-350 
BL00284C 28.56 5.600e-17 216-258 
BL00284E 19.15 7.500e-14 408-433 
BL00284B 17.99 9.379e-13 189-210 


50 


BL01283 


T-box domain proteins. 


BL01283A 24.15 2.125e-39 148-196
BL01283B 23.17 9.438e-34 208-250 
BL01283D 11.70 7.868e-31 298-331
BL01283C 13.05 8.448e-16 260-274 


50 


PR00937 


T-BOX DOMAIN SIGNATURE 


PR00937A 15.25 9.182e-26 156-181 
PR00937D 13.41 7.375e-17 259-274 
PR00937B 14.58 8.615e-15 223-237 
PR00937E 11.86 8.541e-14 301-315 
PR00937F 12.53 1.450e-12 322-331 
PR00937C 10.51 1.000e-11 240-250


50 


PR00938 


BRACHYURY PROTEIN FAMILY 
SIGNATURE 


PR00938C 8.28 6.547e-09 264-282 


50 


PR00427 


INTERLEUKIN-8 RECEPTOR 
SIGNATURE 


PR00427A 16.30 6.776e-09 416-431 


51 


PD01270 


RECEPTOR FC 
IMMUNOGLOBULIN AFFIN.


PD01270D 24.66 8.054e-09 50-86 


52 


BL00237 


G-protein coupled receptors proteins. 


BL00237A 27.68 2.543e-13 181-221 


52 


PR00245 


OLFACTORY RECEPTOR 
SIGNATURE 


PR00245A 18.03 7.682e-11 150-172
PR00245C 7.84 5.286e-10 290-306 


52 


PR00237 


RHODOPSIN-LIKE GPCR 
SUPERFAMILY SIGNATURE 


PR00237C 15.69 3.700e-09 195-218 
PR00237G 19.63 8.535e-09 326-353 


53 


PR00050 


COLD SHOCK PROTEIN 
SIGNATURE 


PR00050A 11.28 3.143e-12 42-58 
PR00050C 9.82 9.151e-11 85-104


53 


BL00352 


'Cold-shock' DNA-binding domain 
proteins. 


BL00352B 23.66 2.881e-13 71-110 
BL00352A 12.19 1.327e-10 42-57 


56 


BL01173 


Lipolytic enzymes G-D-X-G family, 
histidine.


BL01173B 13.27 4.462e-17 140-167 
BL01173C 8.98 4.349e-14 182-196 
BL01173A 9.41 1.818e-13 454-467
BL01173C 8.98 6.553e-13 495-509 
BL01173A 9.41 8.364e-13 107-120 


57 


PR00321 


GAMMA G-PROTEIN 
(TRANSDUCIN) SIGNATURE 


PR00321C 15.39 2.473e-12 123-141 


58 


PR00937 


T-BOX DOMAIN SIGNATURE 


PR00937A 15.25 1.000e-24 117-142 
PR00937D 13.41 5.500e-18 220-235 
PR00937B 14.58 5.235e-13 184-198 
PR00937F 12.53 1.450e-12 293-302 
PR00937E 11.86 1.918e-12 259-273
PR00937C 10.51 3.133e-11 201-211
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58 


BL01283 


T-box domain proteins. 


BL01283A 24.15 1.000e-40 109-157 
oLOlzoJB 15.1 1 y.looe-j4 loy-zll 
BL01283C 13.05 8.286e-17 221-235 


58 


PR00938 


BRACHYURY PROTEIN FAMILY
SIGNATURE


PR00938C 8.28 7.384e-09 225-243 


59 


PD02059 


CORE POLYPROTEIN PROTEIN 
uAO CUrM 1 AHNo: Jr. 


PD02059A 28.10 2.694e-09 116-157 


63 


BL00196


Ribosomal protein L30 proteins. 


BLOUtyo 34.3o 3.zOUe-10 40-y/ 


64 


BL00226 


Intermediate filaments proteins. 


BL00226B 23.86 1.205e-31 264-312 


64 


BL01305 


moaA / nifB / pqqE family proteins. 


BL01305B 10.95 8.875e-09 78-88 


68 


DM00892 


3 RETROVIRAL PROTEINASE. 


DM00892C 23.55 6.727e-13 33-67 


69 


PR00874 


FUNGI-IV METALLOTHIONEIN 
SIGNATURE 


PR00874C 4.37 7.214e-10 68-83 


69 


PD00866 


GLYCOPROTEIN PROTEIN SPIKE 
E2 PRECURSOR PEPLOMER 


PD00866L 3.73 6.564e-10 1-11 PD00866L
3.73 1.443e-09 26-36 


69 


BL00026 


Chitin recognition or binding domain 
proteins. 


BL00026 12.95 3.013e-09 48-69 


69 


DM01724 


kw ALLERGEN POLLEN CM1 HOL- 
LL 


DM01724 8.14 3.250e-09 10-30 


69 


BL01208 


VWFC domain proteins. 


BL01208B 15.83 6.838e-09 111-126 


69 


BL00243 


Integrins beta chain cysteine-rich 
domain proteins. 


BL00243I 31.77 4.838e-10 106-149
BL00243I 31.77 7.221e-10 18-61 BL00243I
31.77 1.761e-09 41-84 BL00243I 31.77
3.408e-09 31-74 BL00243I 31.77 7.465e-09
71-114


69 


BL00203 


Vertebrate metallothioneins proteins. 


BL00203 13.94 4.107e-13 66-112 BL00203
13.94 2.138e-12 92-138 BL00203 13.94 
1.099e-11 28-74 BL00203 13.94 3.176e-11
82-128 BL00203 13.94 3.374e-11 87-133
BL00203 13.94 5.846e-11 77-123 BL00203
13.94 7.231e-11 102-148 BL00203 13.94
1.670e-10 97-143 BL00203 13.94 2.532e-10 
103-149 BL00203 13.94 5.021e-10 88-134 
BL00203 13.94 7.128e-10 38-84 BL00203 
13.94 7.168e-10 107-153 BL00203 13.94 
7.702e-10 73-119 BL00203 13.94 9.426e-10
25-71 BL00203 13.94 1.918e-09 101-147 
BL00203 13.94 2.745e-09 27-73 BL00203 
13.94 4.031e-09 71-117 BL00203 13.94
4.857e-09 36-82 BL00203 13.94 5.041e-09 
98-144 BL00203 13.94 5.154e-09 6-52
BL00203 13.94 6.418e-09 76-122 BL00203 
13.94 7.980e-09 91-137 BL00203 13.94 
8.255e-09 13-59 BL00203 13.94 8.898e-09 
48-94 


69 


PR00876 


NEMATODE METALLOTHIONEIN 
SIGNATURE 


PR00876B 7.66 9.514e-09 80-94 


73 


PR00875 


MOLLUSC METALLOTHIONEIN 
SIGNATURE 


PR00875A 5.83 9.679e-10 17-29 
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74 


PR00185 


HISTONE H4 SIGNATURE 


PR00185B 13.68 8.888e-09 364-384 


86 


PD00066 


PROTEIN ZINC-FINGER METAL- 
BINDI. 


PD00066 13.92 7.000e-13 200-213 


86


BL00028


Zinc finger, C2H2 type, domain
proteins.


TIT AAAOQ K A7 A Q^Aa 11 9£7 TIT AnAOQ 

16.07 1.900e-10 184-201 BL00028 16.07 
6.100e-10 371-388 BL00028 16.07 6.914e- 
09 317-334 


86


PR00048


SIGNATURE 


PPflflflAfiTi A AO 7 1 Cfio_AO 107 0fl7 


87 


PD02870 


RECEPTOR INTERLEUKIN-1
PRECURSOR.


PD02870D 15.74 8.468e-09 358-393 


88 


BL00048 


Protamine P1 proteins.


Ol TIT f\f\f\AO C In Irt Trt #1*7 T>T AAAjlO 

82 BLOU048 6.J9 5.500e-10 70-y/ oL00048 
6.39 2.350e-09 62-89 BL00048 6.39 3.700e- 
09 60-87 BL00048 6.39 5.050e-09 63-90 

"DT f\f\f\AQ C 1Q £L OCQa AO <1 QO DT fifl/V/IB 

dLUUU4o O.oy O.Zooe-Uy Ol-oo iJLUUU4o 
6.39 9.438e-09 71-98 


on 

<>y 




SIGNATURE 


ppftnionr 1 n ni soon** mono 017 

PR00320B 12.19 9.486e-10 202-217 
ppnrrconr n ni 7 Qnnp-AQ oqo-^07 

PR00320A 16.74 8.902e-09 202-217 


OA 




FKBP-type peptidyl-prolyl cis-trans
isomerase proteins.


BL00453A 15.57 1.000e-15 81-96 
BL00453C 9.72 1.000e-12 147-160 


92 


PR00299 


ALPHA CRYSTALLIN SIGNATURE 


PR00299B 17.53 7.211e-09 324-337 


93 


PF00676 


Dehydrogenase El component. 


PF00676D 14.40 4.857e-13 421-441 
PF00676C 16.88 1.931e-10 389-413 
PF00676B 24.71 5.433e-10 192-230 


96 


BL00824 


Elongation factor 1 beta/beta'/delta
chain proteins. 


BL00824B 9.21 3.919e-09 1472-1492 


99 


PR00417 


PROKARYOTIC DNA
TOPOISOMERASE I SIGNATURE


PR00417A 12.66 5.415e-09 866-880 


102 


PD01066 


PROTEIN ZINC FINGER ZINC-
FINGER METAL-BINDING NU.


PD01066 19.43 6.936e-29 17-56 


102 


BL00028 


Zinc finger, C2H2 type, domain
proteins. 


BL00028 16.07 2.059e-14 435-452 BL00028 
16.07 7.353e-14 351-368 BL00028 16.07 
2.350e-13 295-312 BL00028 16.07 9.100e- 
13 491-508 BL00028 16.07 2.174e-12 463-
480 BL00028 16.07 2.038e-11 379-396 BL00028
16.07 2.385e-11 323-340 BL00028 16.07
3.423e-11 239-256 BL00028 16.07 9.654e-
11 407-424 BL00028 16.07 1.000e-10 267- 
284 


102 


BL00479 


Phorbol esters / diacylglycerol binding 
domain proteins. 


BL00479A 19.86 6.362e-09 366-389 


102 


PD02462 


PROTEIN BOLA TRANSCRIPTION 
REGULATION AC. 


PD02462A 22.48 7.695e-09 204-239 


102 


PR00048 


C2H2-TYPE ZINC FINGER 
SIGNATURE 


PR00048A 10.52 1.000e-15 460-474 
PR00048A 10.52 1.000e-14 432-446 
PR00048A 10.52 3.250e-14 320-334
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PR00048A 10.52 4.750e-14 348-362 
PR00048A 10.52 6.250e-14 376-390 
PR00048A 10.52 3.133e-13 292-306 
PR00048A 10.52 1.529e-12 488-502 
PR00048B 6.02 1.000e-11 336-346
PR00048B 6.02 9.308e-11 224-234
PR00048B 6.02 2.688e-10 476-486 
PR00048B 6.02 3.250e-10 280-290 
PR00048A 10.52 5.696e-10 404-418 
PR00048A 10.52 6.087e-10 264-278 
PR00048B 6.02 6.187e-10 420-430 
PR00048A 10.52 7.214e-10 236-250 
PR00048B 6.02 8.875e-10 364-374 
PR00048B 6.02 3.368e-09 171-181 
PR00048B 6.02 4.316e-09 448-458 


103 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 9.438e-37 10-49 


103 


BL00028 


Zinc finger, C2H2 type, domain 
proteins. 


BL00028 16.07 5.500e-13 413-430 BL00028 
16.07 1.000e-12 273-290 BL00028 16.07 
1.783e-12 357-374 BL00028 16.07 7.577e- 
11 301-318 BL00028 16.07 9.308e-11 441-
458 BL00028 16.07 9.308e-11 469-486
BL00028 16.07 1.300e-10 329-346 


103 


PR00048 


C2H2-TYPE ZINC FINGER 
SIGNATURE 


PR00048A 10.52 7.000e-14 354-368 
PR00048A 10.52 2.286e-13 298-312 
PR00048A 10.52 9.357e-13 270-284 
PR00048A 10.52 3.209e-12 410-424 
PR00048B 6.02 5.000e-12 286-296
PR00048B 6.02 1.000e-11 342-352
PR00048B 6.02 1.000e-11 370-380
PR00048B 6.02 1.125e-10 314-324 
PR00048A 10.52 2.565e-10 466-480 
PR00048A 10.52 4.522e-10 438-452 
PR00048B 6.02 1.474e-09 454-464 
PR00048A 10.52 3.520e-09 326-340 
PR00048B 6.02 4.789e-09 482-492 


103


PD00066


PROTEIN ZINC-FINGER METAL-
BINDI.


PD00066 13.92 8.200e-16 289-302 PD00066
13.92 6.538e-15 373-386 PD00066 13.92 2.800e-
14 345-358 PD00066 13.92 4.600e-14 457-
470 PD00066 13.92 4.130e-11 401-414
PD00066 13.92 9.654e-10 429-442 PD00066 
13.92 5.200e-09 261-274 


103 


BL01024 


Protein phosphatase 2A regulatory 
subunit PR55 proteins. 


BL01024H 13.88 7.353e-09 163-216 


104 


PD01781 


PROTEASE IMMUNOGLOBULIN 
PRECURSO. 


PD01781B 27.55 8.680e-09 325-369


105 


PD01781 


PROTEASE IMMUNOGLOBULIN 
PRECURSO. 


PD01781B 27.55 8.680e-09 379-423 


107 


PR00939 


C2HC-TYPE ZINC-FINGER 


PR00939B 13.27 3.209e-09 1302-1311
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SIGNATURE 




108 


PD00066 


PROTEIN ZINC-FINGER METAL- 
BINDI.


PD00066 13.92 2.800e-14 279-292 PD00066 
13.92 4.600e-14 307-320 PD00066 13.92 
1.000e-13 335-348 PD00066 13.92 7.500e- 
13 363-376 


108 


BL00028 


Zinc finger, C2H2 type, domain 
proteins. 


BL00028 16.07 7.882e-14 319-336 BL00028 
16.07 7.300e-13 347-364 BL00028 16.07
4.913e-12 291-308 BL00028 16.07 2.500e-
10 263-280 BL00028 16.07 1.257e-09 375-
392 


108 


PR00048 


C2H2-TYPE ZINC FINGER 
SIGNATURE 


PR00048A 10.52 4.214e-13 288-302 
PR00048B 6.02 5.000e-12 304-314 
PR00048A 10.52 6.824e-12 372-386 
PR00048A 10.52 7.353e-12 344-358 
PR00048A 10.52 7.158e-ll 316-330 
PR00048B 6.02 7.231e-11 276-286
PR00048B 6.02 1.000e-09 332-342 
PR00048B 6.02 6.211e-09 388-398 


108 


BL00115


Eukaryotic RNA polymerase II 
heptapeptide repeat proteins. 


BL00115Z 3.12 8.842e-18 96-145
BL00115Z 3.12 7.144e-17 89-138
BL00115Z 3.12 6.888e-16 103-152
BL00115Z 3.12 7.791e-15 82-131
BL00115Z 3.12 3.947e-14 61-110
BL00115Z 3.12 7.292e-14 117-166
BL00115Z 3.12 9.164e-14 110-159
BL00115Z 3.12 1.000e-13 75-124
BL00115Z 3.12 3.871e-13 54-103
BL00115Z 3.12 6.819e-13 68-117
BL00115Z 3.12 4.168e-11 124-173
BL00115Z 3.12 9.651e-10 47-96 BL00115Z
3.12 7.485e-09 71-120 BL00115Z 3.12
9.669e-09 78-127 


109 


PR00193 


MYOSIN HEAVY CHAIN 
SIGNATURE 


PR00193D 14.36 5.680e-33 391-420 
PR00193C 12.60 4.789e-32 156-184 

nnnAi rnn 1 1 *zc\ i azc\^~ i ia n/ 

PR001 93B 1 1 .69 1 .692e-2o 1 1 0-1 36 
rKUUlSoii iyA/ ZOUUe-Zl 443-474 
PR00193A 15.41 4.130e-20 54-74
PR00193F 19.47 5.091e-12 444-473


110 


BL00239 


Receptor tyrosine kinase class II 
proteins. 


BL00239B 25.15 2.985e-16 67-115 


110 


PR00109 


TYROSINE KINASE CATALYTIC 
DOMAIN SIGNATURE 


PR00109B 12.27 8.660e-13 132-151 


110 


BL00107 


Protein kinases ATP-binding region 
proteins. 


BL00107A 18.39 4.462e-25 132-163 
BL00107B 13.31 6.143e-10 197-213 


110 


DM00406 


GLIADIN. 


DM00406 7.73 1.800e-09 818-831


110 


BL00904 


Protein prenyltransferases alpha subunit 
repeat proteins proteins. 


BL00904A 8.30 5.596e-09 815-865 


110 


BL00415 


Synapsins proteins. 


BL00415A 6.15 7.684e-09 796-837 


110 


DM00215 


PROLINE-RICH PROTEIN 3. 


DM00215 19.43 2.373e-09 801-834 
DM00215 19.43 7.712e-09 797-830 






Table 3 



SEQ ID
NO: 


Database 
entry ID 


Description 


*Results


110 


PR00209 


ALPHA/BETA GLIADIN FAMILY
SIGNATURE 


PR00209B 4.88 4.188e-09 817-836 
PR00209C 4.56 8.929e-09 790-804 


111 


BL00678 


Trp-Asp (WD) repeat proteins proteins. 


BL00678 9.67 2.800e-10 366-377 BL00678 
9.67 5.263e-09 417-428 BL00678 9.67
6.211e-09 186-197 


111 


PR00308 


TYPE I ANTIFREEZE PROTEIN 
SIGNATURE 


PR00308C 3.83 8.892e-10 108-118
PR00308C 3.83 8.892e-10 109-119 
PR00308C 3.83 8.364e-09 107-117 


111 


PR00320 


G-PROTEIN BETA WD-40 REPEAT 
SIGNATURE


PR00320A 16.74 4.000e-13 364-379 
PR00320B 12.19 7.923e-12 415-430 
PR00320A 16.74 5.966e-11 415-430
PR00320C 13.01 7.214e-11 415-430
PR00320C 13.01 9.217e-11 364-379
PR00320A 16.74 9.690e-11 184-199

nn AA^IATJ 11 1 Q 1 A^*T« 1 A 1 QA 1 AO 

ri\\J\)DZVD 12.19 3.Uj/e-lU 184-199 
PR00320C 13.01 6.G40e-10 184-199 

rL\\J\JjZ\jD IZ.ly O.OD/e-IU j04-j/9 

PR00320B 12.19 1.450e-09 457-472 

PR00320A 16.74 4.732e-09 457-472 
PR00320A 16 74 6 4RRe-09 281-2Q/5 
PR00320C 13.01 1.000e-08 281-296 


112 


DM00547 


1 kw CHROMO BROMODOMAIN 
SHADOW GLOBAL. 


DM00547F 23.43 2.350e-35 384-431
DM00547C 17.30 7.000e-19 23-45
DM00547E 13.94 5.154e-17 135-158
DM00547D 11.60 2.750e-13 105-119


112 


BL00315 


Dehydrins proteins. 


BL00315A 9.35 4.246e-10 1301-1329 


112 


PF00426 


Outer Capsid protein VP4 
(Hemagglutinin).


PF00426S 15.67 6.438e-10 1271-1309 


112 


BL00039 


DEAD-box subfamily ATP-dependent 
helicases proteins. 


BL00039D 21.67 6.793e-10 368-414 


112 


PD02191 


I ATP-BINDING NUCLEOSIDE 
TRANSCR. 


PD02191A 13.95 9.036e-10 107-122 


112 


BL00048 


Protamine PI proteins. 


BL00048 6.39 1.900e-09 1257-1284 
BL00048 6.39 5.050e-09 1258-1285 






Dihydropyridine sensitive L-type
calcium channel (Beta subuni.


T>x:(\{\nnA a i a ai n i in Q no i oca 1 ihc 
rrUU/ /4A 10.4/ /.13Ue-U9 IZSU-lizo 

PF00774A 16.47 7.730e-09 1276-1322


112 


BL00115 


Eukaryotic RNA polymerase II
heptapeptide repeat proteins.


BL00115Z 3.12 3.448e-11 1254-1303
BL00115Z 3.12 3.302e-10 1261-1310 
BL00115Z 3.12 4.837e-10 1258-1307 
BL00115Z 3.12 7.767e-10 1251-1300 
BL00115Z 3.12 8.167e-10 1263-1312 
BL00115Z 3.12 8.884e-10 1260-1309 09 
1247-1296 BL00115Z 3.12 2.985e-09 1240-
1289 BL00115Z 3.12 5.632e-09 1265-1314
BL00115Z 3.12 8.676e-09 1253-1302 
BL00115Z 3.12 9.471e-09 1268-1317 
BL00115Z 3.12 9.735e-09 1257-1306 


112 


PF00186 


Flocculin repeat proteins. 


PF00186I 9.10 5.290e-13 1279-1309
PF00186I 9.10 6.838e-12 1277-1307
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PF001861 9.10 2.957e-ll 1282-1312 
PF00186I 9.10 7.496e-11 1276-1306
PF00186I 9.10 5.200e-10 1268-1298
PF00186I 9.10 7.450e-10 1275-1305
PF00186I 9.10 7.450e-10 1280-1310

T>I7AA1 Q#Cf Q 1 A A C/lla AQ n££ IIO/C 

rruUloOi y.iu 4.!>4Je-uy lzoo-izyo 
PF001861 9.10 5.252e-09 1285-1315 

DT7AA 1 Q/TT Q 1A< AO. 1 ^ AO 1 OTO 1 *3 A9 

PF00186I 9.10 6.102e-09 1274-1304

D17AO 1 fi^CT Q 1A7 91X*» AO 1 0TA 1 QflA 

PFftfti rat q ift r fti a*» ftQ i9Ai 1901 

PTTftft 1 RAT 0 1 ft 0 AIIpjTIQ 1 0XO 1 9Q9 

PF001861 9.10 9.433e-09 1267-1297 
PFftftlRATQ 1ft 1 OnftA-AR 195A-19RA 


114 


PR00700 


PROTEIN TYROSINE 
PHOSPHATASE SIGNATURE 


PR00700D 12.47 8.788e-ll 237-256 


114 


BL00383


Tyrosine specific protein phosphatases
proteins.


RT ftft^R^P 1ft ^ 5 397p 1ft 9Aft 9^1 


116 


PR00884 


RIBOSOMAL PROTEIN HS6
SIGNATURE 


PROftRRAF 8^4 7^ftp-ftQ AAQ-AAA 


117 


PD02890 


ISOMERASE CHALCONE- 
FLAVONONEFLAV. 


PD02890C 16.14 8.457e-09 200-235


118 


BL00226 


Intermediate filaments proteins. 


BL00226B 23.86 6.513e-10 401-449


118 


BL00326 


Tropomyosins proteins. 


BL00326D 8.76 1.925e-09 196-237 


118 


BL01160 


Kinesin light chain repeat proteins. 


BL01160B 19.54 2.678e-09 328-382 

RT ill 1 fiftR 1 Q 54 R 0^9#*-ft0 A5A-7AR 


119 


PD01823 


PROTEIN INTERGENIC REGION 
ABC1 PRECURSOR 
MITOCHONDRION T. 


PD01823C 16.13 7.000e-14 352-373 

Pnft1R70JR 14 OA O 7R9p.1 0 09R.OAR 

PD01823D 16.66 6.857e-10 430-451 


119 


PD01115 


PRECURSOR AMPHIBIAN SKIN
SIGNAL.


PD01 1 1 5B 12 92 8 4?1e-ft9 768-987 


122 


BL00854 


Proteasome B-type subunits proteins.


BL00854C 29.92 8.435e-19 114-143 


124 


BL00651 


Ribosomal protein L9 proteins.


Ttf Aftfi51 A 9^ 95 8 A77a-1 7 OA 1 1A 


125 


BL01245 


RIO1/ZK632.3/MJ0444 family
proteins.


BTii17A5F 18 75 9 17^.93 13A-171 

BL01245A 14.04 8.342e-23 206-231 
BL01245C 13.31 6.564e-15 262-282 
BIj01245E 15 28 1 ftftOe-12 39ft-Hft 
BL01245B 11.91 9.809e-10 245-255 


128 


PR00793 


PROLYL AMINOPEPTIDASE (S33) 
FAMILY SIGNATURE 


PR00793C 12.24 1.333e-09 168-183 


128 


PR00111 


ALPHA/BETA HYDROLASE FOLD 
SIGNATURE 


PR00111C 13.46 6.000e-09 182-196 


129 


BL01160 


Kinesin light chain repeat proteins. 


BL01160D 10.17 7.077e-09 505-534 


129 


PD00126 


PROTEIN REPEAT DOMAIN TPR 
NUCLEA. 


PD00126A 22.53 1.000e-08 695-716 


130 


BL00355 


HMG14 and HMG17 proteins. 


BL00355 5.97 8.412e-32 18-49 


130 


PR00925 


NONHISTONE CHROMOSOMAL 
PROTEIN HMG17 FAMILY 
SIGNATURE 


PR00925B 3.73 3.400e-16 34-47 PR00925A 
5.47 1.750e-15 18-33 PR00925C 5.57 
9.824e-09 51-62 


131


PR00041 


CAMP RESPONSE ELEMENT 


PR00041E 7.20 2.976e-13 305-326 
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BINDING (CREB) PROTEIN 
SIGNATURE 




131 


BL00036 


bZIP transcription factors basic domain 
proteins. 


BL00036 9.02 4.103e-09 299-312 


132 


PR00211 


GLUTELIN SIGNATURE 


PR00211B 0.86 1.750e-09 205-226 
PR00211B 0.86 8.750e-09 199-220 


132 


DM00215 


PROLINE-RICH PROTEIN 3. 


DM00215 19.43 4.529e-11 201-234 
DM00215 19.43 1.804e-10 195-228 
DM00215 19.43 2.768e-10 192-225 
DM00215 19.43 4.054e-10 202-235 
DM00215 19.43 6.304e-10 207-240 
DM00215 19.43 7.429e-10 180-213 
DM00215 19.43 8.393e-10 196-229 
DM00215 19.43 8.714e-10 218-251 
DM00215 19.43 6.034e-09 185-218 
DM00215 19.43 6.034e-09 219-252 
DM00215 19.43 6.492e-09 223-256 
DM00215 19.43 7.254e-09 200-233 
DM00215 19.43 9.390e-09 189-222 
DM00215 19.43 9.695e-09 213-246 


133 


BL00455 


Putative AMP-binding domain proteins. 


BL00455 13.31 5.125e-11 293-309 


133 


PR00154 


AMP-BINDING SIGNATURE 


PR00154A 8.88 6.276e-09 286-298 


136 


PD00015 


GLYCOPROTEIN PRECURSOR 
CELL SI. 


PD00015A 8.90 6.400e-09 243-251 


138 


BL00227 


Tubulin subunits alpha, beta, and 
gamma proteins. 


BL00227B 19.29 1.000e-40 52-107 
BL00227C 25.48 1.000e-40 113-165 
BL00227A 24.55 8.200e-36 1-35 


140 


PR00049 


WILM'S TUMOUR PROTEIN 
SIGNATURE 


PR00049D 0.00 8.377e-13 60-75 
PR00049D 0.00 7.500e-10 63-78 
PR00049D 0.00 8.071e-10 61-76 


140 


PR00806 


VINCULIN SIGNATURE 


PR00806B 4.28 8.440e-09 68-82 


140 


BL00904 


Protein prenyltransferases alpha subunit 
repeat proteins proteins. 


BL00904A 8.30 9.553e-09 60-110 


141 


PR00049 


WILM'S TUMOUR PROTEIN 
SIGNATURE 


PR00049D 0.00 6.438e-12 1175-1190 


141 


BL01187 


Calcium-binding EGF-like domain 
proteins pattern proteins. 


BL01187B 12.04 5.800e-11 1284-1300 
BL01187B 12.04 8.200e-11 180-196 


141 


BL01248 


Laminin-type EGF-like (LE) domain 
proteins. 


BL01248 11.02 4.343e-12 1362-1375 
BL01248 11.02 2.350e-11 322-335 
BL01248 11.02 4.125e-10 271-284 


141 


PR00764 


COMPLEMENT C9 SIGNATURE 


PR00764B 13.56 3.475e-09 1047-1068 


141 


PR00010 


TYPE II EGF-LIKE SIGNATURE 


PR00010C 11.16 4.205e-09 185-196 


141 


BL01113 


C1q domain proteins. 


BL01113A 17.99 5.673e-09 1621-1210 


141 


PR00011 


TYPE III EGF-LIKE SIGNATURE 


PR00011D 14.03 8.895e-12 551-132 
PR00011B 13.08 5.846e-11 551-132 
PR00011D 14.03 3.215e-10 313-332 
PR00011A 14.06 4.214e-10 313-332 
PR00011B 13.08 7.783e-10 313-332 
PR00011A 14.06 7.781e-09 551-132 


141 


BL00420 


Speract receptor repeat proteins domain 
proteins. 


BL00420A 20.42 8.200e-09 1186-1215 




141 


PD02510 


ISOMERASE GALACTOSE-DI- 
PHOSPHATE. 


PD02510B 18.31 8.170e-09 548-144 


141 


PR00261 


LOW DENSITY LIPOPROTEIN 
(LDL) RECEPTOR SIGNATURE 


PR00261F 11.57 9.544e-09 1052-1074 


141 


PR00288 


PUROTHIONIN SIGNATURE 


PR00288C 10.15 9.165e-09 311-326 


142 


DM01970 


0 kw ZK632.12 YDR313C 
ENDOSOMAL III. 


DM01970B 8.60 4.750e-17 114-565 


142 


BL01160 


Kinesin light chain repeat proteins. 


BL01160B 19.54 2.373e-09 203-257 


142 


BL00518 


Zinc finger, C3HC4 type (RING 
finger), proteins. 


BL00518 12.23 4.000e-09 559-130 


142 


BL00422 


Granins proteins. 


BL00422E 26.86 8.615e-09 462-498 


143 


PD00066 


PROTEIN ZINC-FINGER METAL- 
BINDI. 


PD00066 13.92 5.846e-15 141-154 
PD00066 13.92 9.217e-11 551-564 
PD00066 13.92 6.700e-09 523-536 


143 


PR00048 


C2H2-TYPE ZINC FINGER 


PR00048A 10.52 9.526e-11 122-136 
PR00048A 10.52 2.174e-10 532-546 
PR00048A 10.52 6.087e-10 588-164 
PR00048B 6.02 7.632e-09 138-148 
PR00048A 10.52 8.920e-09 504-518 


143 


PF00651 


BTB (also known as BR-C/Ttk) domain 
proteins. 


PF00651 15.00 8.920e-09 59-72 


143 


BL00028 


Zinc finger, C2H2 type, domain 
proteins. 


BL00028 16.07 7.577e-11 535-114 
BL00028 16.07 2.200e-10 125-142 
BL00028 16.07 5.800e-10 507-524 
BL00028 16.07 8.714e-09 591-170 
BL00028 16.07 9.743e-09 444-461 


144 


PR00926 


MITOCHONDRIAL CARRIER 
PROTEIN SIGNATURE 


PR00926F 17.75 3.672e-10 262-285 


144 


BL00215 


Mitochondrial energy transfer proteins. 


BL00215A 15.82 7.900e-15 16-41 
BL00215A 15.82 8.147e-14 260-285 
BL00215A 15.82 1.804e-09 166-191 
BL00215B 10.44 5.500e-09 114-127 


144 


PR00927 


ADENINE NUCLEOTIDE 
TRANSLOCATOR 1 SIGNATURE 


PR00927B 14.66 8.644e-09 104-126 


147 


DM01417 


6 kw INDUCING XPMC2 
MUSHROOM SPAC22G7.04. 


DM01417C 12.93 3.250e-11 267-279 
DM01417D 11.08 2.200e-10 306-322 


148 


BL01160 


Kinesin light chain repeat proteins. 


BL01160B 19.54 8.378e-10 349-403 


151 


PR00049 


WILMS TUMOUR PROTEIN 
SIGNATURE 


PR00049D 0.00 7.807e-11 419-434 
PR00049D 0.00 8.125e-11 1284-1299 
PR00049D 0.00 3.929e-10 1283-1298 
PR00049D 0.00 3.288e-09 417-432 


151 


BL00904 


Protein prenyltransferases alpha subunit 
repeat proteins proteins. 


BL00904A 8.30 3.553e-09 416-466 


154 


BL00665 


Dihydrodipicolinate synthetase 
proteins. 


BL00665D 14.76 1.000e-11 109-132 
BL00665C 25.58 5.832e-11 50-101 


154 


PR00146 


DIHYDRODIPICOLINATE 
SYNTHASE SIGNATURE 


PR00146D 16.26 2.525e-10 108-126 
PR00146A 12.62 8.615e-09 13-35 


156 


PD02906 


SYNTHASE I PSEUDOURIDYLATE 
PSEUDOURIDINE LYASE TR. 


PD02906C 24.17 9.115e-15 171-206 
PD02906B 15.35 4.886e-13 142-155 
PD02906D 12.27 1.000e-09 239-249 
PD02906A 10.84 8.333e-09 92-105 


157 


BL00107 


Protein kinases ATP-binding region 
proteins. 


BL00107B 13.31 2.286e-11 396-412 
BL00107A 18.39 6.148e-11 332-363 


157 


PR00109 


TYROSINE KINASE CATALYTIC 
DOMAIN SIGNATURE 


PR00109B 12.27 4.938e-09 332-351 


160 


PF01008 


Initiation factor 2 subunit 


PF01008B 25.59 9.171e-36 366-409 
PF01008A 20.14 8.676e-12 315-336 
PF01008C 12.25 7.382e-10 449-469 


161 


BL00591 


Glycosyl hydrolases family 10 proteins. 


BL00591D 8.33 6.167e-09 2099-2112 


163 


PR00019 


LEUCINE-RICH REPEAT 
SIGNATURE 


PR00019B 11.36 7.120e-09 99-113 
PR00019B 11.36 7.840e-09 73-87 


164 


BL00198 


Nt-dnaJ domain proteins. 


BL00198A 8.07 3.000e-14 143-160 


164 


PR00187 


DNAJ PROTEIN FAMILY 
SIGNATURE 


PR00187A 12.84 8.800e-12 139-159 


165 


PR00310 


ANTIPROLIFERATIVE PROTEIN 
BTG1 FAMILY SIGNATURE 


PR00310B 10.59 4.000e-39 41-71 
PR00310C 12.74 2.256e-33 71-101 
PR00310D 9.10 9.820e-33 101-131 
PR00310A 11.17 7.000e-27 16-41 


165 


BL00960 


BTG1 family proteins. 


BL00960B 24.47 1.000e-40 34-79 
BL00960C 12.68 6.745e-21 98-120 
BL00960A 10.98 5.304e-12 14-26 


166 


BL00216 


Sugar transport proteins. 


BL00216B 27.64 2.688e-21 124-174 


166 


DM00973 


3 kw RESISTANCE BENOMYL 
YLL028W CYCLOHEXIMIDE. 


DM00973A 21.17 4.162e-10 96-133 


166 


PR00171 


SUGAR TRANSPORTER 
SIGNATURE 


PR00171D 12.76 3.520e-13 456-478 
PR00171E 14.87 2.750e-09 479-492 


166 


PR00172 


GLUCOSE TRANSPORTER 
SIGNATURE 


PR00172D 9.13 6.513e-09 456-480 


167 


BL00216 


Sugar transport proteins. 


BL00216B 27.64 5.198e-20 124-174 


167 


DM00973 


3 kw RESISTANCE BENOMYL 
YLL028W CYCLOHEXIMIDE. 


DM00973A 21.17 4.162e-10 96-133 


168 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 5.929e-32 59-98 


168 


PD00066 


PROTEIN ZINC-FINGER METAL- 
BINDI. 


PD00066 13.92 2.385e-15 520-533 
PD00066 13.92 2.800e-14 296-309 
PD00066 13.92 5.200e-14 240-253 
PD00066 13.92 5.200e-14 548-561 
PD00066 13.92 9.400e-14 436-449 
PD00066 13.92 1.000e-13 324-337 
PD00066 13.92 6.143e-12 352-365 
PD00066 13.92 6.885e-10 268-281 


168 


PR00048 


C2H2-TYPE ZINC FINGER 
SIGNATURE 


PR00048B 6.02 6.000e-12 237-247 
PR00048A 10.52 6.294e-12 333-347 
PR00048A 10.52 6.824e-12 361-375 
PR00048A 10.52 9.471e-12 249-263 
PR00048A 10.52 4.316e-11 119-133 
PR00048A 10.52 4.789e-11 529-543 
PR00048A 10.52 6.684e-11 445-459 
PR00048A 10.52 8.141e-11 305-319 
PR00048B 6.02 6.063e-10 321-331 
PR00048B 6.02 6.063e-10 517-527 
PR00048A 10.52 7.261e-10 221-235 
PR00048B 6.02 7.750e-10 545-117 
PR00048B 6.02 1.474e-09 293-303 
PR00048A 10.52 2.800e-09 389-403 
PR00048A 10.52 1.000e-08 417-431 


170 


PR00456 


RIBOSOMAL PROTEIN P2 
SIGNATURE 


PR00456E 3.06 2.820e-11 6-21 
PR00456E 3.06 7.125e-10 3-18 


170 


PD02331 


CYCLIN CELL CYCLE DIVISION 
PROTE. 


PD02331A 19.76 7.429e-15 93-140 
PD02331B 13.43 1.125e-09 174-207 


170 


PR00833 


POLLEN ALLERGEN POA PI 
SIGNATURE 


PR00833H 2.30 5.269e-09 3-18 


171 


PD00126 


PROTEIN REPEAT DOMAIN TPR 
NUCLEA. 


PD00126A 22.53 4.706e-14 140-161 
PD00126A 22.53 6.824e-14 289-310 


173 


BL00741 


Guanine-nucleotide dissociation 
stimulators CDC24 family sign. 


BL00741B 14.27 3.418e-11 294-317 


173 


PR00452 


SH3 DOMAIN SIGNATURE 


PR00452B 11.65 5.154e-11 86-102 


173 


PR00497 


NEUTROPHIL CYTOSOL FACTOR 
P40 SIGNATURE 


PR00497D 11.91 5.962e-10 91-113 


173 


PF00564 


Octicosapeptide repeat proteins. 


PF00564B 24.74 6.442e-09 277-328 


175 


BL01016 


Glycoprotease family proteins. 


BL01016C 22.84 5.292e-19 60-105 
BL01016H 13.71 6.157e-12 307-317 
BL01016E 14.88 3.182e-11 141-169 
BL01016D 8.86 6.741e-09 118-131 


175 


PR00789 


O-SIALOGLYCOPROTEIN 
ENDOPEPTIDASE (M22) 
METALLO-PROTEASE FAMILY 
SIGNATURE 


PR00789E 12.42 7.128e-14 141-163 
PR00789C 16.11 2.707e-12 85-105 
PR00789B 10.48 1.205e-09 64-85 
PR00789D 8.17 7.151e-09 118-131 


176 


PR00850 


GLYCOSYL HYDROLASE FAMILY 
59 SIGNATURE 


PR00850B 6.67 5.455e-09 148-173 


178 


PR00259 


TRANSMEMBRANE FOUR FAMILY 
SIGNATURE 


PR00259A 9.27 8.676e-20 17-41 
PR00259C 16.40 4.750e-17 85-114 
PR00259B 14.81 8.615e-12 58-85 
PR00259D 13.50 2.528e-11 235-262 


178 


BL00421 


Transmembrane 4 family proteins. 


BL00421B 17.62 6.186e-17 64-103 
BL00421A 11.79 6.800e-12 13-32 
BL00421E 20.97 1.514e-10 232-262 
BL00421C 12.89 3.600e-09 147-159 


178 


PR00235 


HERPESVIRUS MAJOR CAPSID 
PROTEIN (MCP) SIGNATURE 


PR00235A 14.64 8.000e-09 87-111 


179 


BL01052 


Calponin family repeat proteins. 


BL01052C 18.51 6.806e-40 87-127 
BL01052A 16.12 7.180e-32 3-35 
BL01052B 15.31 8.031e-26 52-78 
BL01052D 10.26 1.000e-24 174-194 


179 


PR00890 


SMOOTH MUSCLE PROTEIN 22- 
ALPHA (TRANSGELIN) 
SIGNATURE 


PR00890E 14.34 3.813e-21 135-155 
PR00890A 8.61 9.775e-21 34-54 
PR00890C 8.22 1.000e-17 84-98 
PR00890B 8.75 3.455e-17 62-78 
PR00890F 12.92 4.064e-14 161-174 
PR00890D 16.17 5.174e-13 118-128 


179 


PR00888 


SMOOTH MUSCLE 
PROTEIN/CALPONIN FAMILY 
SIGNATURE 


DDOACQfiU O 07 ^ KAa 7fl 17< 1Q1 

jrivuuooori yy / D.04e-zu l /j-iyi 
PR00888C 12.27 5.179e-18 52-68 
PR00888D 16.09 4.273e-17 88-105 
PR00888A 11.87 2.350e-16 3-18 PR00888E 

1 1 511 1 A19o 1£ IftA 19ft PPftftRRRP 7 AA 
11. Ol D.HDZ.Q-LO 1U4-1ZU rxtvUooor 7.44 

A 89^-ld. 19^-1 4ft PPftft888fr 19 71 8 7^Q<» 

14 162-176 PR00888B 13.72 2.350e-12 22- 
36 


179 


PR00889 


CALPONIN SIGNATURE 


PR00889E 12.18 2.726e-12 171-187 


180 


BL00875 


Bacterial type II secretion system 
protein D proteins. 


oiAiUo /da z j.j 1 0.44 /e-w jo /-jyy 


181 


PD01351 


NEUROFILAMENT TRIPL. 


PD01351B 13.72 5.355e-09 23o-264 


182 


DM01354 


ORF2. 


JJMUl J54ri lo.UU o\oZoe-Z7 109-149 
DM01354G 11.57 2.143e-25 78-109 

TYIWTftl 1ZAB 1 A 1 A1 Aa. 1 < AO 7ft 
L)JYlUiJD4r 14.30 1.414e-l j 4Z-/0 

DM01354E 18.69 8.650e-14 17-47 


182 


BL00869 


Renal dipeptidase proteins. 


rt ftftR/^on i a f»9 i ah* no £7 

DJLAJUoOyiV 14. UZ J.4//e-U7 0/-V0 


185 


BL00039 


DEAD-box subfamily ATP-dependent 
helicases proteins. 


BL00039A 18.44 4.000e-25 222-261 
BL00039D 21.67 4.529e-23 498-544 
BL00039C 15.63 4.300e-16 347-371 

rt ftftftiQR iq iqq i7Q*» 9^9 98Q 
DLjKjwjyD iy, iy y.j § ye-ij zoz-zoo 


185 


PD00302 


PROTEASE POLYPROTEIN 
HYDROLASE ASP. 


PD00302B 9.52 1.346e-09 234-250 


186 


PD00066 


PROTEIN ZINC-FINGER METAL- 
BINDI. 


PD00066 13.92 5.714e-12 152-165 PD00066 

1 1 09 (\ 1 A'lf* 19 1 94. 117 


186 


BL00028 


Zinc finger, C2H2 type, domain 
proteins. 


BL00028 16.07 6.885e-11 136-153 
BL00028 16.07 2.200e-10 197-214 


186 


PR00239 


MOLLUSCAN RHODOPSIN C- 
TERMINAL TAIL SIGNATURE 


ppnn9iQ"P 1 ^» 7nc«_no Aon_Ai9 


186 


PR00048 


C2H2-TYPE ZINC FINGER 
SIGNATURE 


JtK.UUU4oA iu.jz z.yo /e-iu loJ-14/ 
PR00048A 10.52 3.739e-10 194-208 

ppAAfMR A in ^9 ft HAIa in 1/*1 17< 
JrlvUUU^foA 1U.DZ O.U4je-lU lOl-l/D 

PR00048B 6.02 8.105e-09 121-131 


187 


BL01022 


PTR2 family proton/oligopeptide 
symporters proteins. 


RT ft 1 099R 99 1 0 A 9Aft*» 1 ft Iftft l^A 
DlAj 1 l/ZZD ijf *f.Z*rUe-lU .JUO-JJ'r 


187 


PR00669 


INHIBIN ALPHA CHAIN 
SIGNATURE 


PPftftAAQR R 97 7 Q1^#»_ftQ 9£A-981 
lr KUUOOS/JD o. Z / f.yLjQ-\jy Z04-Z01 


190 


PR00830 


ENDOPEPTIDASE LA (LON) 
SERINE PROTEASE (S16) 
SIGNATURE 


PR00830A 8.41 3.342e-09 881-901 


191 


PR00109 


TYROSINE KINASE CATALYTIC 
DOMAIN SIGNATURE 


PR00109B 12.27 9^34e-13 261-280 


191 


BL00107 


Protein kinases ATP-binding region 
proteins. 


BL00107A 18.39 1.000e-23 261-292 
BL00107B 13.31 1.000e-12 341-357 


191 


BL00239 


Receptor tyrosine kinase class II 
proteins. 


BL00239B 25.15 6.523e-10 196-244 


191 


BL00479 


Phorbol esters / diacylglycerol binding 
domain proteins 


BL00479C 12.01 1.000e-09 320-333 


191 


PR00834 


HTRA/DEGQ PROTEASE FAMILY 
SIGNATURE 


PR00834F 10.91 2.546e-09 786-799 




193 


BL01033 


Globins profile. 


BL01033A 16.94 2.385e-18 25-47 


193 


PR00814 


BETA HAEMOGLOBIN 
SIGNATURE 


PR00814A 12.94 1.000e-22 30-47 
PR00814B 9.18 7.750e-18 48-64 






J.V1 I V-/VJ J-/WX-) UN OIVJlNrV 1 U IVL-/ 


PR001 7SB 9 02 9 392e-10 25-49 


194 


PR00320 


G-PROTEIN BETA WD-40 REPEAT 
SIGNATURE 


PR00320B 12.19 6.226e-11 140-155 
PR00320A 16.74 4.971e-10 140-155 
PR00320C 13.01 9.280e-10 140-155 


194 


BL00678 


Trp-Asp (WD) repeat proteins proteins. 


BL00678 9.67 7.632e-09 142-153 


196 


PR00832 


PAXILLIN SIGNATURE 


PR00832B 9.87 9.174e-10 309-333 


196 


BL01160 


Kinesin light chain repeat proteins. 


BL01160B 19.54 2.054e-10 376-430 
BL01160B 19.54 6.919e-10 383-437 


196 


PR00049 


WILM'S TUMOUR PROTEIN 
SIGNATURE 


PR00049D 0.00 8.780e-09 40-55 


196 


BL00087 


Copper/Zinc superoxide dismutase 
proteins. 


BL00087C 20.18 8.784e-09 260-296 


196 


PR00806 


VINCULIN SIGNATURE 


PPftflRAAA /JAIQ (\%Ae> AO ^nc 




BL00326 


Tropomyosins proteins. 


PT fVY*9£A 1J.H1 0 lllp HQ *\A(\ 
DLUUjZOA l*f.Ul Sr. i*k3e-Uy jUO-DnU 


197 


PR00674 


LIGHT HARVESTING PROTEIN B 


PR00674A 20.10 7.391e-09 134-155 


198 


PR00192 


F-ACTIN CAPPING PROTEIN BETA 

SUBUNIT SIGNATURE 


ppnniQ9r t £ *x 9 ^nn*» V7 PpnniQ9n 

8.23 4.462e-36 97-125 PR00192E 8.85 

7 nftlWM 919^939 PP00199A R 9^ 1 474<»- 

27 5-26 PR00192B 6 20 3 000e-26 26-48 


198 


BL00231 


F-actin capping protein beta subunit 
proteins. 


BL00231A 8.59 1.000e-40 5-51 
BL00231B 14.16 1.000e-40 84-128 
BL00231D 15.40 1.000e-40 165-200 
BL00231E 11.66 1.000e-40 209-246 
BL00231C 12.77 1.180e-15 146-157 


199 


PF00023 


Ank repeat proteins. 


PF00023A 16.03 4.750e-10 45-61 


199 


PF00791 


Domain present in ZO-1 and Unc5-like 
netrin receptors 


PF00791B 28.49 8.768e-12 87-142 
PF00791B 28.49 7.028e-09 499-1 16 


199 


BL01160 


Kinesin light chain repeat proteins. 


BL01160E 8.74 7.398e-09 323-362 


201 


PR00239 


MOLLUSCAN RHODOPSIN C- 
TERMINAL TAIL SIGNATURE 


PR00239E 1.58 6.114e-09 183-195 


202 


BL00412 


Neuromodulin (GAP-43) proteins. 


BL00412D 16.54 4.033e-10 319-370 


202 


BL00224 


Clathrin light chain proteins. 


BL00224B 16.94 4.845e-09 313-366 


202 


PF00992 


Troponin. 


PF00992A 16.67 8.734e-12 333-368 
PF00992A 16.67 2.776e-09 344-379 
PF00992A 16.67 5.026e-09 351-386 


203 


BL00790 


Receptor tyrosine kinase class V 
proteins. 


BL00790R 16.20 7.677e-09 29-73 


204 


BL00790 


Receptor tyrosine kinase class V 
proteins. 


BL00790R 16.20 7.677e-09 29-73 


205 


BL00790 


Receptor tyrosine kinase class V 
proteins. 


BL00790R 16.20 7.677e-09 29-73 


207 


BL00211 


ABC transporters family proteins. 


BL00211B 13.37 3.077e-17 573-167 
BL00211B 13.37 7.577e-17 1204-1674 











BL00211A 12.23 1.900e-09 472-484 


207 


PR00478 


PHOSPHORIBULOKINASE FAMILY 
SIGNATURE 


PR00478A 13.44 4.133e-09 474-492 


207 


PR00802 


SERUM ALBUMIN FAMILY 
SIGNATURE 


PR00802G 14.57 7.188e-09 971-994 


207 


PR00836 


SOMATOTROPIN HORMONE 
FAMILY SIGNATURE 


PR00836D 13.05 7.125e-09 1504-1519 


209 


PR00049 


WILMS TUMOUR PROTEIN 
SIGNATURE 


PR00049D 0.00 1.786e-10 288-303 


210 


BL00972 


Ubiquitin carboxyl-terminal hydrolases 
family 2 proteins. 


BL00972D 22.55 3.348e-ll 388-413 
BL00972E 20.72 4.343e-09 415-437 


210 


PR00198 


ANNEXIN TYPE II SIGNATURE 


PR00198H 12.05 7.750e-09 682-696 


214 


PD00469 


PROTEIN PRECURSOR SIGNAL 
HVDROT A 


PD00469A 13.95 6.400e-09 73-86 


215 


PF00023 


Ank repeat proteins. 


PF00023A 16.03 8.875e-10 839-855 
PF00023A 16.03 2.286e-09 884-900 




PP (\CY\A1 


RHESUS BLOOD GROUP PROTEIN 

SIGNATURE 




XrJL / 


D1A/U70Z. 


proteins. 




217 


PR00368 


FAD-DEPENDENT PYRIDINE 
NUCLEOTIDE REDUCTASE 
SIGNATURE 


PR00368C 15.74 8.962e-11 326-352 


217 


PR00469 


PYRIDINE NUCLEOTIDE 
DISULPHIDE REDUCTASE CLASS- 
II SIGNATURE 


PR00469I 13.83 7.532e-11 449-468 
PR00469F 16.51 7.152e-09 322-347 


217 


PD02042 


IRON-SULFUR ELECTRON 
TRANSPORT AROMATIC 
HYDROCARB. 


PD02042B 16.75 5.673e-09 126-141 
PD02042A 21.13 9.045e-09 93-120 


217 


PR00419 


ADRENODOXIN REDUCTASE 
FAMILY SIGNATURE 


PR00419A 14.89 9.486e-09 326-349 
PR00419D 10.62 9.534e-09 327-342 


218 


PF00157 


PDZ domain proteins (Also known as 
DHR or GLGF). 


PF00157 13.40 4.600e-09 688-699 


219 


BL00107 


Protein kinases ATP-binding region 
proteins. 


BL00107A 18.39 7.000e-23 65-96 
BL00107B 13.31 4.214e-10 130-146 


219 


PR00109 


TYROSINE KINASE CATALYTIC 
DOMAIN SIGNATURE 


PR00109B 12.27 7.102e-10 65-84 


219 


BL00240 


Receptor tyrosine kinase class III 
proteins. 


BL00240E 11.56 5.029e-09 51-89 


220 


PR00239 


MOLLUSCAN RHODOPSIN C- 
TERMINAL TAIL SIGNATURE 


PR00239E 1.58 3.045e-09 38-50 


220 


DM01803 


1 HERPESVIRUS GLYCOPROTEIN 
H. 


DM01803A 10.51 9.349e-09 34-55 


220 


PR00049 


WILMS TUMOUR PROTEIN 
SIGNATURE 


PR00049D 0.00 5.160e-11 40-55 
PR00049D 0.00 7.807e-11 41-56 
PR00049D 0.00 8.336e-11 38-53 
PR00049D 0.00 2.286e-10 42-57 
PR00049D 0.00 8.857e-10 33-48 
PR00049D 0.00 2.983e-09 37-52 
PR00049D 0.00 9.847e-09 43-58 


222 


BL00326 


Tropomyosins proteins. 


BL00326A 14.01 5.337e-10 825-859 
222 


BL01160 


Kinesin light chain repeat proteins. 


BL01160B 19.54 9.924e-09 516432 


224 


BL00478 


LIM domain proteins. 


BL00478B 14.79 8.527e-09 143-158 


226 


BL00048 


Protamine PI proteins. 


BL00048 6.39 6.063e-09 199-226 


228 


BL00115 


Eukaryotic RNA polymerase II 
heptapeptide repeat proteins. 


BL00115Z 3.12 5.744e-10 113-162 
BL00115Z 3.12 3.449e-09 120-169 


228 


BL00415 


Synapsins proteins. 


BL00415Q 2.23 8.723e-09 253-289 


229 


BL01161 


Glucosamine/galactosamine-6- 
phosphate isomerases proteins. 


BL01161A 19.47 1.000e-40 37-77 
BL01161D 28.14 1.000e-40 199-244 
BL01161B 21.37 5.091e-39 117-160 
BL01161C 18.47 1.500e-23 170-199 


231 


PR00269 


PLEIOTROPHIN/MIDKINE FAMILY 
SIGNATURE 


PR00269A 13.91 3.133e-30 88-113 




231 


BL00181 


PTN/MK heparin-binding protein 
family proteins. 


BL00181A 19.07 4.960e-37 76-112 
BL00181A 19.07 9.224e-18 78-114 


236 


BL00888 


Cyclic nucleotide-binding domain 
proteins. 


BL00888B 14.79 9.069e-13 499-523 




236 


BL00415 


Synapsins proteins. 


BL00415N 4.29 2.774e-09 733-777 


236 


PD00306 


PROTEIN GLYCOPROTEIN 
PRECURSOR RE. 


PD00306A 10.26 3.133e-09 646-660 




236 


PR00209 


ALPHA/BETA GLIADIN FAMILY 
SIGNATURE 


PR00209B 4.88 3.813e-09 739-758 




236 


DM00668 


7FTN 


DM00668A 10.20 8.500e-09 958-971 


X. DO 


BL01188 


GNS1/SUR4 family proteins. 


RT H1 1RRR Hifii 11 5#Ofi 190 1 SI 








rthi i RRr 1 99 fi^ 4 n#w?£ 1*1 909 








"RTjOI 1RRH R rt9 1 9Q0<O1 91R-955 








RT 01 1RRA 1R R9 71R«-1ft *K-R7 


239 


PR00929 


AT-HOOK-LIKE DOMAIN 


PPHft09QR A 1R 5? R7^_fl0 113 ^Rl 






SIGNATURE 


jri\.uu7^70 j.^o o.5riHe-vy 103-1*+*+ 


242 


BL00232 


Cadherins extracellular repeat proteins 


RT fifi919R 19 70 9 7A*w» 9*\ 5/11 151 
DIAJUZjZD OZ. /y Z. /DOc-ZD J*+1-1D1 






domain proteins. 


RT AH919R 19 70 R 9#?1*» 99 7AA-R1A 

dujuzjzo oz. /y o.zoje-zz /oo-oi*+ 








RT ftfl919R 19 7Q 9 1Q7j» 91 fH 115 

DJUuuzozjD oz. /y z..yy /e-zi o/-iij 








RT flft919R 19 70 A 1 11<» 10 1 AR1 1 59Q 
BIAHJ/jZd JZ. 4.1 jjc-iy 1**51-IjZ:/ 








RT fifl919R 19 70 1 HAH^ Ifi 1171 1A1Q 








RT H0919R 19 70 9 ££9«» 1ft 1£Q1 1710 








RT HA919R 19 70 5 909*> 1R 19R7 1115 
dLAAjZjZd OZ./y D.ZVZe-lo 1Z0/-1JJD 








RT AH919R 19 70 Q 1 Alt* 1 R 1 1 Aft 1 1 0fi 








RT A0919R 19 7Q 1 9£5*» 17 Qftf\_1fi9ft 
JD1A/UZJZ.D jZ. / y l.ZOje-1/ i/oU-lUZo 








BL00232B 32.79 1.529e-17 426-474 








BL00232B 32.79 2.588e-17 1084-1132 








BL00232B 32.79 1.386e-16 1184-1232 








BL00232C 10.65 5.390e-12 1369-1387 








BL00232C 10.65 1.391e-11 204-660 








BL00232C 10.65 2.174e-11 1584-1164 








BL00232C 10.65 4.522e-11 1689-1707 








BL00232C 10.65 1.000e-10 65-83 








BL00232C 10.65 4.115e-10 1285-1303 








BL00232B 32.79 7.200e-10 649-697 








BL00232C 10.65 9.827e-10 978-996 








BL00232C 10.65 1.947e-09 170-188 








BL00232B 32.79 2.137e-09 172-220 



BL00232C 10.65 4.474e-09 1182-1200 
BL00232C 10.65 8.737e-09 539-119 


243 


BL00795 


Involucrin proteins. 


BL00795C 17.06 4.977e-10 64-109 
BL00795C 17.06 6.300e-09 55-100 


244 


BL00790 


Receptor tyrosine kinase class V 


BL00790I 20.01 7.823e-15 23-54 
BL00790I 20.01 9.400e-11 310-341 
BL00790I 20.01 1.900e-10 117-148 
BL00790I 20.01 3.893e-09 215-246 


244 


PR00014 


FIBRONECTIN TYPE III REPEAT 
SIGNATURE 


PR00014D 12.04 6.400e-11 30-45 
PR00014D 12.04 6.400e-11 317-332 
PR00014C 15.44 9.171e-09 204-223 
PR00014D 12.04 1.000e-08 222-237 


245 


BL00183 


Ubiquitin-conjugating enzymes 
proteins. 


BL00183 28.97 7.037e-10 140-188 


246 


PR00019 


LEUCINE-RICH REPEAT 


PR00019A 11.19 8.800e-12 205-219 
PR00019B 11.36 2.000e-11 202-216 


247 


BL00214 


Cytosolic fatty-acid binding proteins. 


BL00214B 26.51 7.180e-24 206-251 
BL00214A 21.17 6.250e-22 165-191 


247 


PR00178 


FATTY ACID-BINDING PROTEIN 
SIGNATURE 


PR00178A 15.07 4.913e-21 166-187 
PR00178C 20.54 2.500e-17 226-254 
PR00178D 13.52 6.897e-16 272-291 
PR00178B 10.52 4.900e-10 200-212 






RIBOSOMAL PROTEIN S2 
SIGNATURE 


PR00395C 16.17 2.047e-13 46-64 


248 


BL00962 


Ribosomal protein S2 proteins. 


BL00962C 15.90 2.846e-12 46-64 


249 


BL00227 


Tubulin subunits alpha, beta, and 

gamma proteins. 


BL00227D 18.46 1.000e-40 74-128 
BL00227F 21.16 1.529e-33 226-280 
BL00227E 24.15 1.409e-26 178-213 


250 


BL00227 


Tubulin subunits alpha, beta, and 
gamma proteins. 


BL00227C 25.48 1.000e-40 39-91 
BL00227D 18.46 1.000e-40 148-202 
BL00227F 21.16 1.529e-33 300-354 
BL00227E 24.15 1.409e-26 252-287 


251 


BL00152 


ATP synthase alpha and beta subunits 
proteins. 


BL00152B 21.40 1.900e-31 191-229 
BL00152A 15.38 5.154e-21 134-160 
BL00152C 11.41 6.250e-12 291-303 


252 


BL00152 


ATP synthase alpha and beta subunits 
proteins. 


BL00152E 22.68 1.000e-32 285-323 
BL00152A 15.38 5.154e-21 134-160 
BL00152C 11.41 6.250e-12 247-259 


253 


BL00518 


Zinc finger, C3HC4 type (RING 
finger), proteins. 


BL00518 12.23 2.200e-11 54-63 


253 


BL01282 


BIR repeat proteins. 


BL01282B 30.49 2.029e-09 35-74 


254 


DM00892 


3 RETROVIRAL PROTEINASE. 


DM00892C 23.55 9.739e-12 417-451 


254 


PR00417 


PROKARYOTIC DNA 
TOPOISOMERASE I SIGNATURE 


PR00417A 12.66 8.472e-09 65-79 


255 


BL01052 


Calponin family repeat proteins. 


BL01052C 18.51 1.000e-40 88-128 
BL01052A 16.12 2.875e-35 3-35 
BL01052B 15.31 5.219e-26 52-78 


255 


PR00888 


SMOOTH MUSCLE 
PROTEIN/CALPONIN FAMILY 
SIGNATURE 


PR00888D 16.09 9.112e-19 89-106 
PR00888E 11.81 2.800e-18 105-121 
PR00888F 7.44 4.600e-18 126-141 
PR00888A 11.87 7.750e-18 3-18 
PR00888C 12.27 7.286e-17 52-68 
PR00888G 12.73 9.438e-15 163-177 
PR00888B 13.72 1.321e-14 22-36 


255 


PR00890 


SMOOTH MUSCLE PROTEIN 22- 
ALPHA (TRANSGELIN) 


PR00890E 14.34 1.429e-27 136-156 
PR00890A 8.61 1.000e-26 34-54 
PR00890C 8.22 1 fi00e-19 85-99 
PR00890B 8.75 6.318e-19 62-78 
PR00890F 12.92 1.205e-17 162-175 
PR00890D 16.17 1.130e-13 119-129 


257 


BL00745 


Prokaryotic-type class I peptide chain 

release factors signat. 


BL00745C 13.66 1.000e-40 202-249 
BL00745B 22.56 8.683e-33 148-191 
BL00745D 14.90 8.435e-23 280-303 


259 


BL00194 


Thioredoxin family proteins. 


BL00194 12.16 7.429e-10 684-697 


260 


BL00612 


Osteonectin domain proteins. 


BL00612E 13.12 3.948e-10 391-436 






Thyroglobulin type-1 repeat proteins 
proteins. 


BL00484C 17.01 8.244e-11 136-151 
BL00484B 9.04 2.145e-10 249-263 
BL00484C 17.01 2.309e-09 269-284 
BL00484B 9.04 8.950e-09 116-130 


262 


PR00187 


DNAJ PROTEIN FAMILY 
SIGNATURE 


PR00187A 12.84 2.375e-09 288-308 


262 


BL00198 


Nt-dnaJ domain proteins. 


BL00198A 8.07 3.681e-09 292-309 


262 


BL00157 


Aminotransferases class-V pyridoxal- 
phosphate attachment site proteins. 


BL00157A 11.72 8.200e-09 16-26 


263 


PR00320 


G-PROTEIN BETA WD-40 REPEAT 
SIGNATURE 


PR00320B 12.19 2.125e-09 207-222 


263 


PF00913 


Trypanosome variant surface 
glycoprotein. 


PF00913A 7.33 2.500e-09 666-673 


266 


BL01144 


Ribosomal protein L31e proteins. 


BL01144 25.07 1.000e-40 21-73 


268 


DM00516 


186 DISCOIDIN I N-TERMINAL 


DM00516 30.53 8.168e-13 153-198 


268 


BL00132 


Zinc carboxypeptidases, zinc-binding 


BL00132C 21.35 7.863e-10 307-348 
BL00132A 26.07 8.988e-10 224-265 


268 


PR00765 


CARBOXYPEPTIDASE A 
METALLOPROTEASE (M14) 
FAMILY SIGNATURE 


PR00765B 15.57 7.171e-12 276-291 
PR00765D 14.16 1.551e-09 420-434 




BL00170 


Cyclophilin-type peptidyl-prolyl cis- 
trans isomerase signatur 


BL00170A 17.08 9.018e-09 485-512 


269 


BL00622 


Bacterial regulatory proteins, luxR 
family proteins. 


BL00622 32.69 9.780e-09 11-58 


270 


PR00048 


C2H2-TYPE ZINC FINGER 
SIGNATURE 


PR00048A 10.52 1.000e-11 447-461 
PR00048A 10.52 4.316e-11 389-403 
PR00048A 10.52 6.684e-11 362-376 


270 


PF00651 


BTB (also known as BR-C/Ttk) domain 
proteins. 


PF00651 15.00 3.143e-10 37-50 


270 


BL00028 


Zinc finger, C2H2 type, domain 
proteins. 


BL00028 16.07 7.000e-10 392-409 BL00028 
16.07 9.100e-10 256-273 BL00028 16.07 
2.286e-09 450-467 BL00028 16.07 8.714e- 
09 365-382 


274 


DM00303 


6 LEA 11-MER REPEAT REPEAT. 


DM00303A 13.20 3.310e-09 467-517 


275 


PF00622 


Domain in SPla and the RYanodine 
Receptor. 


PF00622B 21.00 9.357e-14 374-396 
PF00622C 12.62 1.857e-12 458-472 


275 


BL00518 


Zinc finger, C3HC4 type (RING 
finger), proteins. 


BL00518 12.23 8.800e-11 44-53 


277 


PF00651 


BTB (also known as BR-C/Ttk) domain 
proteins. 


PF00651 15.00 9.133e-10 65-78 


278 


PD00066 


PROTEIN ZINC-FINGER METAL- 
BINDI. 


PD00066 13.92 8.200e-16 295-308 
PD00066 13.92 8.200e-16 519-532 
PD00066 13.92 1.692e-15 351-364 
PD00066 13.92 4.462e-15 547-122 
PD00066 13.92 4.600e-14 323-336 
PD00066 13.92 4.600e-14 435-448 
PD00066 13.92 7.000e-14 463-476 
PD00066 13.92 1.500e-13 239-252 
PD00066 13.92 3.143e-12 267-280 
PD00066 13.92 3.143e-12 407-420 
PD00066 13.92 8.826e-11 211-224 
PD00066 13.92 2.038e-10 491-504 
PD00066 13.92 2.385e-10 379-392 


278 


PR00048 


C2H2-TYPE ZINC FINGER 
SIGNATURE 


PR00048A 10.52 7.750e-16 444-458 
PR00048A 10.52 6.727e-15 360-374 
PR00048A 10.52 9.182e-15 528-542 
PR00048A 10.52 7.000e-14 472-486 
PR00048A 10.52 7.750e-14 388-402 
PR00048A 10.52 1.000e-13 332-346 
PR00048A 10.52 3.133e-13 304-318 
PR00048A 10.52 4.857e-13 118-132 
PR00048A 10.52 6.786e-13 500-514 
PR00048B 6.02 1.000e-12 292-302 
PR00048A 10.52 8.941e-12 192-206 
PR00048B 6.02 1.000e-11 348-358 
PR00048A 10.52 1.947e-11 248-262 
PR00048B 6.02 2.385e-11 264-274 

ffcir* t\.t\f\ A y aa mm <« 44 r a a has- 

PR00048B 6.02 7.231e-11 544-116 
PR00048A 10.52 7.632e-11 416-430 
PR00048B 6.02 8.615e-11 236-246 
PR00048B 6.02 2.688e-10 516-526 
PR00048B 6.02 4.375e-10 460-470 
PR00048B 6.02 4.375e-10 488-498 
PR00048B 6.02 4.938e-10 404-414 
rjtvuuuH- 00 o.uz o.ucoe-iu jzuo^v 
PR00048A 10.52 7.214e-10 220-234 
PR00048B 6.02 1.947e-09 432-442 
PR00048B 6.02 4.316e-09 572-144 


278 


DM01970 


0 kw ZK632.12 YDR313C 
ENDOSOMAL III. 


DM01970B 8.60 5.012e-09 191-204 


279 


PD00066 


PROTEIN ZINC-FINGER METAL- 
BINDI. 


PD00066 13.92 6.400e-16 449-462 
PD00066 13.92 6.538e-15 504-517 
PD00066 13.92 9.308e-15 421-434 
PD00066 13.92 7.000e-14 476-489 
PD00066 13.92 6.087e-11 393-406 


279 


BL00028 


Zinc finger, C2H2 type, domain 
proteins. 


BL00028 16.07 2.500e-17 350-367 
BL00028 16.07 5.050e-13 405-422 
BL00028 16.07 9.171e-12 433-450 
BL00028 16.07 2.731e-11 488-505 
BL00028 16.07 3.077e-11 516-533 

DUU\J\J£0 lU.u/ 0.1UUC"1U J//07H 


279 


PD0746? 


PRfiTFTM ROT A TR ANSPRTPTTflN 
REGULATION AC. 




279 


PR00048 


C2H2-TYPE ZINC FINGER 

SIGNATURE 


PR00048B 6.02 5.154e-11 501-511 
PR00048B 6.02 1.000e-10 446-456 
PR00048A 10.52 1.391e-10 513-527 
PR00048A 10.52 2.565e-10 485-499 
PR00048A 10.52 5.696e-10 402-416 
PR00048B 6.02 8.875e-10 418-428 
PR00048A 10.52 1.720e-09 430-444 
PR00048B 6.02 3.368e-09 390-400 


285 


BL00276 


Channel forming colicins proteins. 


BL00276A 8.87 6.500e-09 257-269 


286 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 2.000e-30 10-49 


286 


PD00066 


PROTEIN ZINC-FINGER METAL- 
BINDI. 


PD00066 13.92 6.400e-16 388-401 
PD00066 13.92 3.769e-15 248-261 
PD00066 13.92 9.308e-15 304-317 
PD00066 13.92 2.200e-14 360-373 
PD00066 13.92 2.200e-14 416-429 
PD00066 13.92 6.400e-14 332-345 
PD00066 13.92 1.000e-13 220-233 
PD00066 13.92 5.000e-13 276-289 
PD00066 13.92 5.500e-09 136-149 


286 


BL00028 


Zinc finger, C2H2 type, domain 
proteins. 


BL00028 16.07 2.286e-16 260-277 BL00028 
16.07 2.588e-14 288-305 BL00028 16.07 
2.800e-13 400-417 BL00028 16.07 6.850e-
13 120-137 BL00028 16.07 3.423e-11 148-
165 BL00028 16.07 7.923e-11 344-361
BL00028 16.07 2.500e-10 204-221 BL00028
16.07 2.500e-10 428-445 BL00028 16.07
3.100e-10 316-333 BL00028 16.07 6.100e-
10 176-193 BL00028 16.07 1.771e-09 232-
249 BL00028 16.07 8.200e-09 372-389


286 


PR00048 


C2H2-TYPE ZINC FINGER 
SIGNATURE 


PR00048A 10.52 7.000e-17 257-271 
PR00048A 10.52 6.727e-15 397-411 
PR00048A 10.52 2.929e-13 285-299 
PR00048A 10.52 9.471e-12 369-383 
PR00048B 6.02 1.000e-11 329-339
PR00048A 10.52 1.474e-11 313-327
PR00048A 10.52 2.421e-11 425-439
PR00048B 6.02 3.077e-11 385-395
PR00048A 10.52 6.684e-11 117-131
PR00048A 10.52 8.141e-11 201-215
PR00048A 10.52 1.783e-10 341-355 
PR00048B 6.02 2.125e-10 301-311 
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NO: 










PR00048B 6.02 2.125e-10 357-367
PR00048B 6.02 2.688e-10 217-227 
PR00048A 10.52 3.739e-10 229-243 
PR00048B 6.02 4.938e-10 273-283 
PR00048B 6.02 1.474e-09 245-255 
PR00048A 10.52 2.440e-09 145-159 
PR00048B 6.02 3.842e-09 161-171 
PR00048B 6.02 8.105e-09 441-451 
PR00048B 6.02 9.053e-09 189-199 


287 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 7.407e-23 3-42 


287 


BL00028 


Zinc finger, C2H2 type, domain 
proteins. 


BL00028 16.07 8.941e-14 269-286 BL00028 
16.07 1.000e-13 549-566 BL00028 16.07
2.565e-12 633-650 BL00028 16.07 6.087e-
12 241-258 BL00028 16.07 6.870e-12 297-
314 BL00028 16.07 6.870e-12 381-398
BL00028 16.07 7.214e-12 493-510 BL00028
16.07 1.346e-11 465-482 BL00028 16.07
1.692e-11 353-370 BL00028 16.07 3.769e-
11 325-342 BL00028 16.07 6.192e-11 605-
622 BL00028 16.07 8.962e-11 213-230
BL00028 16.07 1.600e-10 409-426 BL00028
16.07 5.200e-10 185-202 BL00028 16.07
6.700e-10 577-594 BL00028 16.07 3.057e-
09 521-538 BL00028 16.07 6.143e-09 437-
454


287 


PR00048 


C2H2-TYPE ZINC FINGER 
SIGNATURE 


PR00048A 10.52 9.250e-14 238-252 
PR00048A 10.52 3.209e-12 266-280 
PR00048A 10.52 4.706e-12 490-504 
PR00048A 10.52 5.765e-12 462-476 
PR00048A 10.52 7.882e-12 630-644 
PR00048A 10.52 8.941e-12 518-532 
PR00048A 10.52 9.471e-12 164-178 
PR00048A 10.52 5.737e-11 378-392
PR00048A 10.52 7.158e-11 546-560
PR00048B 6.02 7.231e-11 180-190
PR00048A 10.52 8.141e-11 210-224
PR00048A 10.52 9.053e-11 406-420
PR00048A 10.52 3.348e-10 322-336 
PR00048B 6.02 3.813e-10 338-348 
PR00048B 6.02 3.813e-10 394-404 
PR00048B 6.02 3.813e-10 478-488 
PR00048B 6.02 4.938e-10 506-516 
PR00048A 10.52 8.043e-10 434-448 
PR00048B 6.02 8.875e-10 226-236 
PR00048B 6.02 8.875e-10 450-460 
PR00048B 6.02 1.000e-09 366-376
PR00048B 6.02 1.000e-09 422-432
PR00048A 10.52 3.520e-09 574-588
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PR00048B 6.02 7.158e-09 590-600 
PR00048B 6.02 7.632e-09 310-320 
PR00048B 6.02 7.632e-09 562-572
PR00048A 10.52 9.280e-09 350-364 


289 


PR00070 


DIHYDROFOLATE REDUCTASE 
SIGNATURE 


PR00070C 13.09 6.143e-16 51-63
PR00070D 11.63 2.929e-15 112-127 


289 


BL00075 


Dihydrofolate reductase proteins.


BL00075A 27.70 7.900e-16 8-39 BL00075B
13.49 3.813e-15 51-63 BL00075C 8.51
2.862e-11 66-79 BL00075D 5.74 8.105e-10
113-123


292 


PR00250 


FUNGAL PHEROMONE MATING 
FACTOR STE2 GPCR SIGNATURE 


PR00250D 14.62 9.163e-09 254-278


294 


PR00081 


GLUCOSE/RIBITOL
DEHYDROGENASE FAMILY
SIGNATURE 


PR00081A 10.53 2.731e-09 39-57 


294 


PR00080 


ALCOHOL DEHYDROGENASE 
SUPERFAMILY SIGNATURE


PR00080C 17.16 6.464e-ll 191-211 
PR00080A 9.32 9.750e-09 118-130


295 


PR00806 


VINCULIN SIGNATURE


PR00806B 4.28 8.920e-09 276-290 
PR00806B 4.28 9.202e-09 275-289 


296 


PF00992 


Troponin. 


PF00992A 16.67 3.789e-10 553-588 


296 


BL00752 


XPA protein. 


BL00752B 19.17 8.144e-09 130-612 


296 


BL01160 


Kinesin light chain repeat proteins. 


BL01160B 19.54 8.551e-09 536-590 


298 


PR00511 


TEKTIN SIGNATURE


PR00511C 7.86 4.214e-09 371-388


300 


BL00353 


HMG1/2 proteins.


BL00353B 11.47 9.171e-19 228-278


301 


PR00240 


ALPHA-1A ADRENERGIC 
RECEPTOR SIGNATURE 



PR00240C 8.38 3.941e-10 316-336 


302 


BL00518 


Zinc finger, C3HC4 type (RING 
finger), proteins. 


BL00518 12.23 2.200e-ll 54-63 


302 


BL01282 


BIR repeat proteins. 


BL01282B 30.49 2.029e-09 35-74 


305 


PR00193


MYOSIN HEAVY CHAIN
SIGNATURE 


PftOOIQlD 14 ^fi 1 S4Se-^1 190-41 Q 

PR00193C 12.60 1.209e-25 143-171 
PR00193B 11.69 2.543e-24 95-121
PR00193A 15.41 6.885e-19 39-59 
PR00193E 19.47 3.291e-12 444-473 


305 


BL00675 


ATP-binding region A proteins. 




306 


PR00239 


MOLLUSCAN RHODOPSIN C-
TERMINAL TAIL SIGNATURE


PR00239E 1.58 5.920e-ll 47-59 


306 


PD00066 


PROTEIN ZINC-FINGER METAL-
BINDI.


PD00066 13.92 7.923e-15 140-153 PD00066 
13.92 4.000e-14 112-125 PD00066 13.92 
1.391e-11 84-97 PD00066 13.92 1.692e-10
168-181 


306 


BL00028 


Zinc finger, C2H2 type, domain 
proteins. 


BL00028 16.07 2.059e-14 96-113 BL00028
16.07 4.130e-12 124-141 BL00028 16.07
2.385e-11 68-85 BL00028 16.07 8.269e-11
180-197 BL00028 16.07 8.962e-11 152-169
BL00028 16.07 9.400e-10 319-336 


306 


PR00799 


ASPARTATE 

AMINOTRANSFERASE 

SIGNATURE 


PR00799D 16.46 5.125e-09 188-214 
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VM*1 J illy 






306 


PR00048 


C2H2-TYPE ZINC FINGER 
SIGNATURE 


PR00048B 6.02 1.900e-13 81-91 PR00048A 
10.52 3.133e-13 65-79 PR00048A 10.52 
9.357e-13 121-135 PR00048A 10.52 9.357e-
13 149-163 PR00048B 6.02 2.688e-10 137-
147 PR00048A 10.52 4.522e-10 279-293
PR00048A 10.52 5.696e-10 177-191
PR00048B 6.02 9.438e-10 109-119
PR00048A 10.52 3.160e-09 93-107
PR00048B 6.02 8.105e-09 165-175


307 


PD00015 


GLYCOPROTEIN PRECURSOR 
CELL SI. 


PD00015A 8.90 6.400e-09 35-43


310 


DM00031 


IMMUNOGLOBULIN V REGION. 


DM00031B 15.41 3.662e-ll 80-114 


311 


BL00824 


Elongation factor 1 beta/beta'/delta 


BL00824C 14.58 1.000e-40 129-167 

BL00824B 9.21 2.080e-21 96-116 
BL00824E 12.49 3.333e-19 210-226 


312 


PR00501 


KELCH REPEAT SIGNATURE 


PR00501B 18.88 7.632e-09 476-491 
PR00501B 18.88 9.763e-09 523-538 


313 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 6.200e-30 43-82 


313 


PD00066 


PROTEIN ZINC-FINGER METAL-
BINDI.


PD00066 13.92 6.500e-13 439-452 PD00066 
13.92 8.000e-13 355-368 PD00066 13.92
1.000e-12 383-396 PD00066 13.92 4.000e-
12 327-340 PD00066 13.92 5.714e-12 411-
424 PD00066 13.92 8.435e-11 299-312 PD00066 13.92
5.800e-14 467-480


313 


BL00028 


Zinc finger, C2H2 type, domain 
proteins. 


BL00028 16.07 2.565e-12 451-468 BL00028 
16.07 2.957e-12 311-328 BL00028 16.07 
3.348e-12 367-384 BL00028 16.07 1.692e-
11 423-440 BL00028 16.07 2.731e-11 283-
300 BL00028 16.07 2.800e-10 339-356
BL00028 16.07 9.700e-10 199-216 BL00028 
16.07 1.000e-09 395-412 BL00028 16.07 
a ac£a no i on 1 in 


313 


PR00048 


C2H2-TYPE ZINC FINGER 
SIGNATURE 


PR00048A 10.52 5.909e-15 364-378 
PR00048A 10.52 2.286e-13 308-322 

PR00048A 10.52 6.824e-12 448-462 
PR00048A 10.52 2.421e-11 196-210
PR00048A 10.52 1.000e-10 280-294
PR00048B 6.02 3.813e-10 324-334
PR00048B 6.02 4.375e-10 464-474
PR00048A 10.52 6.870e-10 336-350
PR00048A 10.52 7.214e-10 420-434
PR00048B 6.02 7.750e-10 436-446
PR00048B 6.02 4.316e-09 380-390 


314 


PR00121 


SODIUM/POTASSIUM-
TRANSPORTING ATPASE
SIGNATURE 


PR00121D 16.72 1.577e-13 210-232 


314 


PR00119 


P-TYPE CATION-TRANSPORTING 


PR00119B 13.94 9.194e-12 217-232
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D esc rintinn 








ATPASE SUPERFAMILY 
SIGNATURE 




314 


BL01228 


Hypothetical cof family proteins. 


BL01228D 17.44 3.400e-ll 646-671 


314 



BL00154


E1-E2 ATPases phosphorylation site

proteins. 


BL00154C 12.38 4.060e-12 213-232
RTjOOI ^4F R 01 9 SQ7e-1 1 207-nfiQ 


315 


BL00888 


Cyclic nucleotide-binding domain 
proteins. 


BL00888B 14.79 1.692e-10 396-420 


315 


BL00420 


Speract receptor repeat proteins domain 
proteins. 


BL00420A 20.42 8.338e-09 215-682 


315 


DM00668 


ZEIN. 


DM00668A 10.20 8.500e-09 155-170 


316 


PR00727 


BACTERIAL LEADER PEPTIDASE 1 
(S26) FAMILY SIGNATURE 


PR00727C 13.04 9.063e-16 108-128 
PR00727B 12.51 7.848e-11 81-94


316 


BL00501


Signal peptidases I serine proteins. 


BL00501D 16.69 2.884e-13 108-128 
BL00501C 9.61 9.561e-11 81-93 BL00501B
12.58 7.000e-09 61-77 


317 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 9.471e-27 13-52 


317 


BL00028 


Zinc finger, C2H2 type, domain 
proteins. 


BL00028 16.07 5.235e-14 214-231 BL00028 
16.07 6.850e-13 270-287 BL00028 16.07 
9.100e-13 354-371 BL00028 16.07 1.391e-
12 158-175 BL00028 16.07 1.346e-11 298-
315 BL00028 16.07 3.769e-11 242-259
BL00028 16.07 6.538e-11 380-397 BL00028
16.07 8.800e-10 186-203 BL00028 16.07
1.514e-09 326-343


317 


PR00048 


C2H2-TYPE ZINC FINGER 


SIGNATURE 


PR00048B 6.02 3.000e-12 199-209 
PR00048A 10.52 7.882e-12 351-365 
PR00048A 10.52 8.412e-12 323-337 
PR00048A 10.52 8.941e-12 239-253 
PR00048A 10.52 1.474e-11 211-225
PR00048A 10.52 6.211e-11 155-169
PR00048B 6.02 7.231e-11 311-321
PR00048A 10.52 8.141e-11 267-281
PR00048B 6.02 3.250e-10 339-349

PR00048B 6.02 3.813e-10 255-265 

PR00048B 6.02 3.842e-09 171-181 
PR00048B 6.02 3.842e-09 393-403
PR00048A 10.52 8.200e-09 295-309 


319 


PR00004 


ANAPHYLATOXIN DOMAIN 
SIGNATURE 


PR00004C 12.46 8.141e-09 91-103 


320 


DM00060 


338 kw NEUREXIN ALPHA III
CYSTEINE. 


DM00060 6.92 6.500e-11 28-38


320 


PR00010 


TYPE II EGF-LIKE SIGNATURE


PR00010C 11.16 7.667e-11 44-55


325 


PR00020 


MAM DOMAIN SIGNATURE 


PR00020A 18.17 5.776e-12 344-363 
PR00020C 13.66 6.932e-10 417-429 


325 


BL00740 


MAM domain proteins. 


BL00740A 13.87 8.313e-12 346-359 
BL00740B 19.76 8.500e-09 486-507 


325 


PD02080 


T-CELL GLYCOPROTEIN CD8


PD02080B 20.69 9.621e-09 123-162
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CHAIN SURFACE ALPHA PRE. 




326 


BL00048 


Protamine P1 proteins.


BL00048 6.39 6.128e-10 167-194


326 


PF01140


Matrix protein (MA), p15.


PF01140D 15.54 9.791e-09 220-255


327 


PR00020 


MAM DOMAIN SIGNATURE 


PR00020C 13.66 2.615e-11 581-593
PR00020B 15.52 5.059e-10 52-69
PR00020B 15.52 1.789e-09 553-570


329 


PD01066 


PROTEIN ZINC FINGER ZINC-
FINGER METAL-BINDING NU.


PD01066 19.43 9.357e-32 8-47 


329 


BL00028 


Zinc finger, C2H2 type, domain 
proteins. 


BL00028 16.07 3.209e-14 284-301 BL00028 
16.07 4.600e-13 508-525 BL00028 16.07 
6.400e-13 368-385 BL00028 16.07 4.115e-
11 396-413 BL00028 16.07 4.115e-11 424-
441 BL00028 16.07 8.269e-11 172-189
BL00028 16.07 8.962e-11 256-273 BL00028
16.07 9.308e-11 312-329 BL00028 16.07
9.654e-11 200-217 BL00028 16.07 3.100e-
10 340-357 BL00028 16.07 5.500e-10 452-
469 BL00028 16.07 9.100e-10 480-497
BL00028 16.07 4.086e-09 228-245 


329 


PD00066 


PROTEIN ZINC-FINGER METAL-
BINDI.


PD00066 13.92 7.000e-14 272-285 PD00066 
13.92 5.000e-13 328-341 PD00066 13.92 
5.500e-13 188-201 PD00066 13.92 5.500e- 

509 PD00066 13.92 6.143e-12 468-481 
PD00066 13.92 2.731e-10 440-453 PD00066 
13.92 4.808e-10 160-173 PD00066 13.92 
5.500e-10 244-257 PD00066 13.92 7.000e- 
09 216-229 PD00066 13.92 7.000e-09 412- 
425 


332 


PD02870 


RECEPTOR INTERLEUKIN-1 
PRECURSOR. 


PD02870B 18.83 5.871e-ll 468-501 


332 


PR00019 


LEUCINE-RICH REPEAT 

SIGNATURE


PR00019A 11.19 8.043e-10 275-289


332 


BL00240 


Receptor tyrosine kinase class III 
proteins. 


BL00240B 24.70 4.447e-09 430-454


333 


BL00738 


S-adenosyl-L-homocysteine hydrolase 
proteins. 


BL00738J 18.61 1.000e-40 154-204

UUJU/ioii 23. Uo j.iZUeoO 4ooozi 
RT jfl07^RF 12 7 261e-29 387-419 
BL00738A 16.27 9.660e-27 216-256 
BL00738C 16.53 7.923e-25 281-319 
BL00738G 14.29 6.268e-23 446-468
BL00738B 12.28 8.085e-21 256-281 
BL00738E 14.18 9.200e-19 361-384 
BL00738I 14.57 5.135e-17 545-583
BL00738D 7.16 5.109e-13 335-350 


333 


BL00836 


Alanine dehydrogenase & pyridine 
nucleotide transhydrogenase. 


BL00836D 22.30 8.622e-09 424-461 


337 


PR00425 


BRADYKININ RECEPTOR 
SIGNATURE 


PR00425C 13.23 3.148e-09 80-100 


342 


PD01823 


PROTEIN INTERGENIC REGION 


PD01823E 9.30 6.824e-12 108-121
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ABC1 PRECURSOR 


PD01823D 16.66 1.265e-09 46-67 






MITOCHONDRION T. 




343 


PR00976 


RIBOSOMAL PROTEIN S21 
FAMILY SIGNATURE. 


PR00976C 10.41 2.837e-09 396-407 


343 


DM00215 


PROLINE-RICH PROTEIN 3 


DM00215 19.43 1.458e-09 473-506 
DM00215 19.43 4.814e-09 463-496


343 


PR00671 


INHIBIN BETA B CHAIN 
SIGNATURE 


PR00671C 4.18 9.172e-09 707-727 


343 


PD01234 


PROTEIN NUCLEAR 
BROMODOMAIN TRANS


PD01234B 15.53 1.000e-08 482-500 


344 


PR00175 


MYOGLOBIN SIGNATURE 


PR00175B 9.02 2.143e-10 25-49 


344


PR00814


BETA HAEMOGLOBIN
SIGNATURE


PR00814C 9.20 6.523e-10


344 


PR00173 


ERYTHROCRUORIN FAMILY 


PR00173A 15.91 7.158e-10 25-48 


344 


BL01033 


Globins profile. 


BL01033A 16.94 1.000e-16 25-47






BL01033B 13.81 8.615e-09 87-99


344 


PR00612 


ALPHA HAEMOGLOBIN 
SIGNATURE 


PR00612E 9.04 4.194e-12 122-139 
PR00612B 10.92 3.483e-10 32-43 
PR00612D 9.76 9.438e-09 74-88


345 


PR00814 


BETA HAEMOGLOBIN 
SIGNATURE 


PR00814C 9.20 6.523e-10 104-122 


345 


BL01033 


Globins profile. 


BL01033A 16.94 5.125e-10 63-85 
BL01033B 13.81 8.615e-09 125-137 


345 


PR00612 


ALPHA HAEMOGLOBIN 
SIGNATURE 


PR00612E 9.04 4.194e-12 160-177 
PR00612B 10.92 3.483e-10 70-81 
PR00612D 9.76 9.438e-09 112-126 


349 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 6.133e-32 6-45 


350 


BL00972 


Ubiquitin carboxyl-terminal hydrolases
family 2 proteins. 


BL00972A 11.93 6.318e-19 364-382

HT fWxQ70F\ 00 <^ 7 Q£Qt* 1 £ 01 A_^71 
DiJJyjy /ZD ZZ.OD /.yOoe-10 ZllM) / j 

TIT AAQTOTl 0 AS 1 10 AAK ASK 


350 


PR00049 


WILMS TUMOUR PROTEIN 
SIGNATURE 


PR00049D 0.00 8.008e-13 121-136 
PR00049D 0.00 7.375e-12 125-140 
PR00049D 0.00 5.916e-11 128-143

pp/WlAQTi (\ nn a 7ar*» i i i oo i T7 
rxvvUU^Ti'jj i/.uu o. /*foe-i i izz-i j / 

punnndon n nn q ^q^«-i i iofi-141 
PR00049D 0.00 1.286e-10 119-134 
PR00049D 0.00 8.929e-10 127-142
PR00049D 0.00 2.678e-09 129-144 
PR00049D 0.00 4.051e-09 123-138 
PR00049D 0.00 4.051e-09 124-139 
PR00049D 0.00 4.051e-09 130-145


350 


PR00211 


GLUTELIN SIGNATURE 


PR00211B 0.86 7.500e-09 124-145 


350 


DM00215 


PROLINE-RICH PROTEIN 3.


DM00215 19.43 5.339e-10 108-141 
DM00215 19.43 7.268e-10 112-145
DM00215 19.43 2.525e-09 106-139 
DM00215 19.43 9.695e-09 107-140 


350 


BL00048 


Protamine P1 proteins.


BL00048 6.39 9.888e-09 145-172 


352 


BL00518 


Zinc finger, C3HC4 type (RING 


BL00518 12.23 4.429e-10 214-223 
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finger), proteins. 




353 


BL00518 


Zinc finger, C3HC4 type (RING
finger), proteins. 


BL00518 12.23 4.429e-10 179-188


354 


BL01009 


Extracellular proteins SCP/Tpx- 
1/Ag5/PR-1/Sc7 proteins. 


BL01009D 14.19 9.341e-17 160-181 
BL01009A 13.75 3.769e-14 80-98
RIHIftflQP \% SO 5 lTte-14 1Q4-91A 

BL01009C 10.54 2.667e-11 127-141


354 


PR00838 


VENOM ALLERGEN 5 SIGNATURE 


PR00838G 16.07 2.304e-14 158-178 
PR00838D 8.73 4.452e-12 80-99 PR00838F 
ion 7 ^?<»-in 1^^.141 

lv.ll /.JJ4,G-L\J l£J~LHl 


354 


PR00837 


ALLERGEN V5/TPX-1 FAMILY 


PR00837C 17.21 7.429e-18 159-176 

PT>nnft^7A \a 77 i onnp h qo 
rRMVojt /a i*f. / / i.yuue-13 ou-yy 

PR0G8T7B 11 64 3 483e-09 197-141 


356 


BL00215 


Mitochondrial energy transfer proteins. 


BL00215A 15.82 8.500e-17 16-41 
BL00215B 10.44 4.900e-09 177-190
BL00215A 15.82 6.786e-09 133-158 
BL00215B 10.44 7.300e-09 278-291 


356 


PR00926 


MITOCHONDRIAL CARRIER 
PROTEIN SIGNATURE 


PR00926E 11.70 6.049e-13 91-110
PR00926F 17.75 7.600e-11 240-263
PR00926F 17.75 5.219e-10 18-41 PR00926D 
10.53 7.387e-09 246-265 


357 


PR00326 


GTP1/OBG GTP-BINDING PROTEIN 
FAMILY SIGNATURE 


PR00326A 8.75 7.150e-11 21-42


357 


BL00113 


Adenylate kinase proteins. 


BL00113A 12.74 6.677e-09 22-39 


357 


BL01128 


Shikimate kinase proteins. 


BL01128A 18.84 7.802e-09 21-55


357 


BL00300 


SRP54-type proteins GTP-binding 
domain proteins. 


BL00300B 20.56 1.000e-08 18-64 


358 


BL00972 


Ubiquitin carboxyl-terminal hydrolases 
family 2 proteins.


BL00972A 11.93 6.318e-19 324-342 
BL00972D 22.55 3.903e-16 170-194 
BL00972B 9.45 1.600e-12 405-415 


364 


DM00215 


PROLINE-RICH PROTEIN 3. 


DM00215 19.43 1.482e-10 355-388 


364 


PR00217 


43 KD POSTSYNAPTIC PROTEIN 
SIGNATURE 


PR00217C 10.91 4.600e-10 302-318 


365 


BL00518 


Zinc finger, C3HC4 type (RING 
finger), proteins. 


BL00518 12.23 2.800e-11 125-134


365 


BL00415 


Synapsins proteins. 


BL00415N 4.29 2.839e-09 387-431 


365 


DM00215 


PROLINE-RICH PROTEIN 3. 


DM00215 19.43 7.706e-11 377-410
DM00215 19.43 8.412e-11 333-366
DM00215 19.43 2.678e-09 356-389 
DM00215 19.43 5.138e-09 376-409 


365 


BL01102 


Prokaryotic dks A/traR C4-type zinc 
finger. 


BL01102 15.99 5.705e-09 109-135 


365 


PR00211 


GLUTELIN SIGNATURE 


PR00211B 0.86 5.959e-11 407-428
PR00211B 0.86 2.212e-10 401-422 
PR00211B 0.86 9.500e-09 336-357 


365 


PR00049 


WILM'S TUMOUR PROTEIN
SIGNATURE 


PR00049D 0.00 9.695e-09 335-350 


367 


PD00126 


PROTEIN REPEAT DOMAIN TPR 
NUCLEA. 


PD00126A 22.53 8.448e-09 2-23 
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370 


BL00028 


Zinc finger, C2H2 type, domain 
proteins. 


BL00028 16.07 7.353e-14 157-174 BL00028
16.07 1.000e-13 269-286 BL00028 16.07
8.200e-13 493-510 BL00028 16.07 3.739e-
12 213-230 BL00028 16.07 6.478e-12 381-
398 BL00028 16.07 1.346e-11 185-202
BL00028 16.07 2.385e-11 129-146 BL00028
16.07 2.385e-11 325-342 BL00028 16.07
5.154e-11 241-258 BL00028 16.07 9.654e-
11 437-454 BL00028 16.07 1.300e-10 297-
lid pt finnix i/? A7 0 i nn^ 1 a a no ak 

BL00028 16.07 9.100e-10 465-482 


370


PD00066


PROTEIN ZINC-FINGER METAL-
BINDI.


PD00066 13.92 3.077e-15 145-158 PD00066 13.92
8.800e-14 173-186 PD00066 13.92 3.500e- 
13 369-382 PD00066 13.92 8.500e-13 341- 
354 PD00066 13.92 9.133e-12 397-410 
PD00066 13.92 2.174e-11 313-326 PD00066
13.92 3.348e-11 453-466 PD00066 13.92
3.739e-11 481-494 PD00066 13.92 7.214e-
11 257-270 PD00066 13.92 2.038e-10 425-
438 PD00066 13.92 6.538e-10 201-214 
PD00066 13.92 5.200e-09 285-298 


370 


DM01970 


0 kw ZK632.12 YDR313C
ENDOSOMAL III.


DM01970B 8.60 6.201e-09 265-278 


370 


PR00048 


C2H2-TYPE ZINC FINGER 

SIGNATURE


PR00048A 10.52 1.474e-11 462-476

PPfiAPl/iftA 1A <0 A &QAt* 11 IM 10/C 

PR00048A 10.52 2.957e-10 434-448

rSSXJKIKJ'rOD O.l/Z J.JVA/C-IU JOO-JHo 

PR00048A 10.52 6.478e-10 350-364 
PR00048B 6 02 6 187e-10 226-236 
PR00048A 10.52 6.870e-10 490-504 
PR00048A 10.52 8.826e-10 406-420
PR00048B 6.02 3.842e-09 170-180 
PR00048B 6.02 4.316e-09 366-376 
PR00048B 6.02 4.789e-09 478-488 
PR00048B 6.02 7.632e-09 142-152 
PR00048A 10.52 8.122e-09 126-140 
PR00048B 6.02 9.053e-09 450-460 


371 


BL01019 


ADP-ribosylation factors family 
proteins. 


BL01019B 19.49 6.276e-21 95-150 
BL01019A 13.20 8.453e-17 51-91 


371 


PR00328 


GTP-BINDING SAR1 PROTEIN
SIGNATURE 


PR00328C 13.16 8.481e-13 78-104 
PR00328D 12.56 3.357e-11 123-145


371 


BL01115 


GTP-binding nuclear protein ran 
proteins. 


BL01115A 10.22 8.119e-11 21-65


373 


BL00028 


Zinc finger, C2H2 type, domain 
proteins. 


BL00028 16.07 4.522e-12 208-225 


373 


PD00066 


PROTEIN ZINC-FINGER METAL- 
BINDI. 


PD00066 13.92 7.000e-13 194-207 PD00066 
13.92 7.000e-13 224-237 PD00066 13.92 
7.000e-12 254-267 


373 


PR00048 


C2H2-TYPE ZINC FINGER 


PR00048A 10.52 1.391e-10 205-219 
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SIGNATURE 


PR00048B 6.02 6.063e-10 221-231 


374 


PR00308 


TYPE I ANTIFREEZE PROTEIN
SIGNATURE 


PR00308A 5.90 7.288e-11 533-548
PR00308A 5.90 8.835e-09 534-549 


377 


PD02784 


PROTEIN NUCLEAR 
RIBONUCLEOPROTEIN.


PD02784B 26.46 7.538e-09 147-190 


378 


PD01351 


PROTEIN REPEAT 
NEUROFILAMENT TRIPL. 


PD01351A 8.69 7.469e-09 155-166


380 


PF00094 


von Willebrand factor type D domain 
proteins. 


PF00094C 12.88 1.918e-09 43-53


380 


BL01208 


VWFC domain proteins. 


BL01208B 15.83 3.667e-11 120-135
BL01208B 15.83 1.973e-09 178-193 


380 


PD02138 


PRECURSOR GLYCOPROTEIN 
SIGNAL CELL, 


PD02138A 27.60 9.057e-09 20-69


381 


BL01105 



Ribosomal protein L35Ae proteins.


RT 01 10SR 12 9S 7 910e-13 41-R3 


384 


PR00049 


WILM'S TUMOUR PROTEIN
SIGNATURE 


PR00049D 0.00 9.205e-10 10-25 PR00049D
0.00 1.915e-09 9-24


385 


BL01115


GTP-binding nuclear protein ran
proteins.




385 


BL00905 


GTP1/OBG family proteins. 


BL00905D 15.00 5.313e-09 140-155 


385 


PR00449 



TRANSFORMING PROTEIN P21
RAS SIGNATURE


PR00449C 17.27 3.209e-19 75-98
PR00449A 13.20 1.000e-17 34-56
PR00449D 10.79 3.368e-13 139-153
PR00449B 14.34 8.364e-11 57-74 PR00449E
13.50 8.286e-09 174-197 


386 


BL00115 


Eukaryotic RNA polymerase II
heptapeptide repeat proteins. 


BL00115Z 3.12 7.977e-10 397-446


386 


PR00041 


CAMP RESPONSE ELEMENT 
BINDING (CREB) PROTEIN 
SIGNATURE 


PR00041F 8.53 9.365e-09 256-274 


388 


PF00646 


F-box domain proteins. 


PF00646A 14.37 9.036e-10 28-42 


389 


BL00036 


bZIP transcription factors basic domain 
proteins. 


BL00036 9.02 6.294e-12 81-94 


389 


PR00042 


FOS TRANSFORMING PROTEIN 
SIGNATURE 


PR00042C 8.29 8.105e-13 82-99 PR00042D 
8.97 9.895e-10 100-122 


389 


BL00224 


Clathrin light chain proteins. 


BL00224B 16.94 3.373e-09 70-123


389 


PR00043 


JUN TRANSCRIPTION FACTOR 
SIGNATURE 


PR00043B 8.73 9.596e-09 81-98 


390 


PF00622 


Domain in SPla and the RYanodine 
Receptor. 


PF00622B 21.00 2.500e-13 85-107


391 


BL00564 


Argininosuccinate synthase proteins. 


BL00564A 19.93 6.114e-09 7-44 


392 


PR00048 


C2H2-TYPE ZINC FINGER 
SIGNATURE 


PR00048A 10.52 7.750e-14 230-244 
PR00048A 10.52 4.316e-11 202-216


392 


BL00028 


Zinc finger, C2H2 type, domain 
proteins. 


BL00028 16.07 2.125e-15 205-222 BL00028 
16.07 1.391e-12 233-250 BL00028 16.07 
3.400e-10 177-194 


392 


PD00066 


PROTEIN ZINC-FINGER METAL- 
BINDI. 


PD00066 13.92 3.000e-13 193-206 PD00066 
13.92 3.423e-10 221-234 


393 


PF00622 


Domain in SPla and the RYanodine 
Receptor. 


PF00622B 21.00 1.391e-16 132-154
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393 


BL00028 


Zinc finger, C2H2 type, domain 
proteins. 


BL00028 16.07 8.800e-10 761-778 BL00028 
16.07 2.029e-09 789-806 




PR00048


C2H2-TYPE ZINC FINGER
SIGNATURE


PR00048A 10.52 2.800e-09 758-772


394


PR00501




PP 00501 A R?5 1 40Qp-0Q 5^7-551 


394 


DM00099 


4 kw A55R REDUCTASE 

1 J1JKJYUIN AL. JJlrl i LJtsXJr 1 JiivLLIilNIl 


DM00099B 14.73 4.375e-09 415-425 


395 


PR00399 


SYNAPTOTAGMIN SIGNATURE


PR00399A 9.52 3.133e-19 146-162

PR00399B 14.27 7.750e-16 161-175 
PR00399D 14.48 4.000e-14 242-253 




PR00360




PP003£OT* 1 8 9fiQ*> 1 ^ 901 -915 

PR00360A 14.59 2.800e-12 174-187 
PR00360B 13.61 5.217e-12 340-354 

PP0036OA 14 50 5 907*»-10 "31 1-394 


395


PF00168


C2 domain proteins.


PF001 fiRP 97 40 5 500f* 1 8 ^9^-^40 

PF00168B 11.83 2.000e-09 306-317 




BL01013


Oxysterol-binding protein family


RT/llOnA 95 14 7 9^1^ 91 558-156 
BL01013B 11.33 1.000e-11 185-196


396 


PF00791 


Domain present in ZO-1 and Unc5-like
netrin receptors. 


PF00791B 28.49 3.534e-10 52-107


396 


PD00078 


REPEAT PROTEIN ANK NUCLEAR 
ANKYR. 


PD00078B 13.14 9.000e-ll 173-186 
PD00078B 13.14 3.739e-09 78-91
PD00078B 13.14 4.130e-09 45-58 


396 


PF00023 


Ank repeat proteins.


PF00023B 14.20 3.077e-11 48-58 PF00023B
14.20 3.769e-11 176-186 PF00023A 16.03
7.429e-09 85-101 


397 


PF00023 


Ank repeat proteins. 


PF00023A 16.03 1.750e-10 55-71 


397 


PF00791 


Domain present in ZO-1 and Unc5-like 
netrin receptors. 


PF00791B 28.49 4.455e-11 55-110
PF00791B 28.49 7.291e-10 88-143 


398 


BL00422 


Granins proteins. 


BL00422C 16.18 5.787e-10 134-162 


400 


PR00450 


RECOVERIN FAMILY SIGNATURE 


PR00450D 16.58 8.986e-11 161-181


400 


BL00479 


Phorbol esters / diacylglycerol binding 
domain proteins. 


BL00479B 12.57 4.273e-15 287-303 
BL00479A 19.86 2.667e-14 261-284 

RTn047Q"R 19 57 1 ^£0*» 1ft ^51 *\fH 


400 


PR00171 


CLASS IE CYTOCHROME C 

SIGNATURE


PR00171D 7.30 9.419e-10 334-342 


400 


BL00018 


EF-hand calcium-binding domain

proteins. 


BL00018 7.41 3.348e-09 223-236


400 


PF00781 


Diacylglycerol kinase catalytic domain 
proteins (presumed). 


PF00781F 16.43 1.000e-40 600-199 
PF00781B 12.07 8.364e-35 454-486 
PF00781D 11.11 3.077e-30 532-118 
PF00781C 9.69 5.034e-19 506-521 
PF00781E 12.45 2.385e-17 124-583 
PF00781G 10.09 6.211e-17 678-692 
PF00781H 12.20 1.750e-16 770-782 
PF00781A 6.42 3.667e-09 354-360 


401 


PR00049 


WILM'S TUMOUR PROTEIN
SIGNATURE 


PR00049D 0.00 7.407e-09 325-340 


402 


DM01117 


2 kw TRANSPOSASE WITHIN 


DM01117A 11.17 7.750e-09 364-382 
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TRANSPOSITION VASOTOCIN. 




403 


DM01206 


CORONAVIRUS NUCLEOCAPSID

PROTEIN. 


DM01206B 10.69 9.286e-12 724-744 
DM01206B 10.69 3.466e-10 726-746 
DM01206B 10.69 9.630e-10 722-742 
DM01206B 10.69 7.152e-09 718-738 
DM01206B 10.69 8.861e-09 728-748 


403 


BL00048 


Protamine P1 proteins.


BL00048 6.39 4.197e-10 722-749 BL00048 
6.39 5.500e-10 731-758 BL00048 6.39 
6.329e-10 729-756 BL00048 6.39 9.171e-10
730-757 BL00048 6.39 4.038e-09 728-755
BL00048 6.39 8.538e-09 724-751 BL00048 
6.39 9.438e-09 716-743 


403 


PD00289 


PROTEIN SH3 DOMAIN REPEAT 
PRESYNA 



PD00289 9.97 9.690e-09 130-144 


404 


PD01066 


PROTEIN ZINC FINGER ZINC- 
FINGER METAL-BINDING NU. 


PD01066 19.43 1.353e-27 31-70 


404 


PD00066 


PROTEIN ZINC-FINGER METAL-
BINDI.


PD00066 13.92 5.154e-15 274-287 PD00066 
13.92 7.600e-14 246-259 PD00066 13.92 
8.200e-14 302-315 PD00066 13.92 3.143e- 
12 218-231 PD00066 13.92 4.000e-12 190- 
203 PD00066 13.92 2.800e-09 330-343 


404 


BL00028


Zinc finger, C2H2 type, domain

proteins. 


BL00028 16.07 7.261e-12 230-247 BL00028
16.07 9.171e-12 342-359 BL00028 16.07 
4.300e-10 314-331 BL00028 16.07 7.000e- 
10 174-191 BL00028 16.07 3.314e-09 202- 
219 BL00028 16.07 6.400e-09 286-303 


404 


PR00048 


C2H2-TYPE ZINC FINGER 
SIGNATURE 


PR00048A 10.52 4.214e-13 339-353 
PR00048A 10.52 3.209e-12 227-241 
PR00048A 10.52 1.947e-11 311-325
PR00048A 10.52 4.522e-10 171-185
PR00048B 6.02 2.895e-09 299-309 
PR00048A 10.52 4.600e-09 199-213 
PR00048B 6.02 1.000e-08 187-197 
PR00048B 6.02 1.000e-08 271-281 


406 


BL00610 


Sodium:neurotransmitter symporter
family proteins.


BL00610A 17.73 1.000e-40 68-118 
BL00610B 23.65 1.000e-40 132-182
BL00610C 12.94 1.000e-40 225-277 
BL00610D 20.97 1.000e-40 291-344 
BL00610F 29.02 6.143e-36 540-157 
BL00610E 20.34 3.209e-35 448-491 
BL00610G 12.89 2.200e-15 173-196 


406 


PR00176 


SODIUM/NEUROTRANSMITTER
SYMPORTER SIGNATURE 


PR00176C 10.84 6.226e-23 141-168 
PR00176A 16.82 1.450e-22 68-90 PR00176F 
10.73 8.667e-20 452-472 PR00176B 7.31 
7.000e-18 97-117 PR00176D 9.02 1.000e-17 
252-270 PR00176E 11.41 2.756e-15 334-355 
PR00176H 15.27 7.353e-15 131-590
PR00176G 12.48 5.615e-14 529-112 


407 


DM00179 


w KINASE ALPHA ADHESION T- 
CELL. 


DM00179 13.97 5.304e-09 111-121
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ciiiry mu 
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'Results 


408 


PR00187 


DNAJ PROTEIN FAMILY 

SIGNATURE


PR00187B 13.48 1.800e-16 45-66 


408 


BL00198 


Nt-dnaJ domain proteins. 


BL00198B 15.11 9.217e-15 45-66 
BL00198A 8.07 2.459e-11 19-36


409 


PR00927 


ADENINE NUCLEOTIDE 
TRANSLOCATOR 1 SIGNATURE 


PR00927E 14.93 4.136e-11 246-268


409 



BL00215 


Mitochondrial energy transfer proteins. 


BL00215A 15.82 9.735e-14 11-36
BL00215A 15.82 5.787e-11 108-133
BL00215B 10.44 6.211e-11 258-271
BL00215A 15.82 5.018e-09 211-236 


409 


PR00926 


MITOCHONDRIAL CARRIER 
PROTEIN SIGNATURE 


PR00926D 10.53 5.355e-09 19-38 


410 


PD00066 


PROTEIN ZINC-FINGER METAL-
BINDI.


PD00066 13.92 6.400e-17 411-424 PD00066 
13.92 8.200e-17 327-340 PD00066 13.92 








5.154e-15 271-284 PD00066 13.92 2.800e- 
14 215-228 PD00066 13.92 9.000e-13 355- 
368 PD00066 13.92 6.143e-12 439-452 
PD00066 13.92 6.478e-11 187-200 PD00066
13.92 9.217e-11 243-256


410 


BL00028 


Zinc finger, C2H2 type, domain 
proteins. 


BL00028 16.07 2.588e-14 227-244 BL00028 
16.07 6.824e-14 395-412 BL00028 16.07 
7.882e-14 171-188 BL00028 16.07 2.350e- 
13 339-356 BL00028 16.07 7.300e-13 283- 
300 BL00028 16.07 7.300e-13 367-384 
BL00028 16.07 2.565e-12 423-440 BL00028 
16.07 7.261e-12 199-216 BL00028 16.07 
7.261e-12 311-328 BL00028 16.07 8.435e- 
12 451-468 BL00028 16.07 2.038e-11 255-
272 BL00028 16.07 9.400e-10 143-160 


410 


PR00048 


C2H2-TYPE ZINC FINGER 
SIGNATURE 


PR00048A 10.52 3.250e-14 280-294 
PR00048A 10.52 8.500e-14 336-350 
PR00048A 10.52 7.429e-13 252-266 
PR00048A 10.52 8.714e-13 448-462 
PR00048A 10.52 9.357e-13 392-406 
PR00048A 10.52 1.000e-12 168-182 

PR00048A 10.52 2.059e-12 420-434
PR00048B 6.02 8.615e-11 408-418
PR00048B 6.02 7.188e-10 268-278
PR00048B 6.02 7.188e-10 380-390
PR00048B 6.02 9.438e-10 296-306
PR00048B 6.02 1.000e-09 324-334
PR00048B 6.02 1.474e-09 352-362
PR00048B 6.02 3.842e-09 212-222
PR00048B 6.02 5.263e-09 436-446


411 


BL00018 


EF-hand calcium-binding domain 
proteins. 


BL00018 7.41 5.500e-10 63-76 


413 


PR00014 


FIBRONECTIN TYPE III REPEAT 
SIGNATURE 


PR00014C 15.44 4.600e-10 73-92 


414 


PR00806 


VINCULIN SIGNATURE 


PR00806A 6.63 1.493e-09 785-796 


414 


PR00048 


C2H2-TYPE ZINC FINGER 


PR00048A 10.52 4.240e-09 41-55 
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NO: 








SIGNATURE




414 


PR00049 


WILM'S TUMOUR PROTEIN 
SIGNATURE 


PR00049D 0.00 9.546e-11 781-796
PR00049D 0.00 1.205e-10 263-278 
PR00049D 0.00 4.356e-09 785-800 


A"\A 




in e utuiijaju u llii ^vxttx j proiciiis. 




414 


BL00422 


Granins proteins. 


BL00422C 16.18 6.318e-11 439-467

TVT finAnC 1 1 £ 1 ft Q ftOQc» 1 f\ AAC\-A&% 

BL00422C 16.18 6.294e-09 441-469 


414 


PR00910 


LUTEOVIRUS ORF6 PROTEIN
SIGNATURE


PR00910A 2.51 8.179e-09 265-278 


414 


DM00215 


PROLINE-RICH PROTEIN 3. 


DM00215 19.43 4.203e-09 770-803 


414 


BL00028 


Zinc finger, C2H2 type, domain 
proteins. 


BL00028 16.07 1.257e-09 44-61 BL00028
16.07 2.543e-09 175-192 BL00028 16.07
6.143e-09 119-136 BL00028 16.07 9.743e-
09 147-164


415 


PF00622 


Domain in SPla and the 
RYanodine Receptor. 


PF00622B 21.00 1.000e-13 331-353 

PF0Hfi99P 
rr \j \j uz>Z\v 


415 


BL00518 


Zinc finger, C3HC4 type (RING 
finger), proteins. 


BL00518 12.23 3.400e-11 31-40 


416 


PF00780 


Domain found in NIK1-like kinases, 
mouse citron and yeast ROM. 


PF00780B 23.03 5.929e-33 442-485 


416 


PR00109 


DOMAIN SIGNATURE 


PPOfilftOR 19 97 ^ 91S*»-19 91 1 9^0 
ri\uUiU7D / j,/ijjc-iz zi i-zou 


416 


BL00107 


Protein kinases ATP-binding region 
proteins. 


RTil01fi7A 1ft 5 90/W99 91 1-949 
BL00107B 13.31 9.308e-12 283-299 


416 


BL00239 


Receptor tyrosine kinase class II 
proteins. 


BL00239B 25.15 5.164e-10 145-193 


416 


BL00915 


Phosphatidylinositol 3- and 4-kinases 
proteins. 


BL00915C 22.43 9.357e-10 203-242 


417 


BL00021 


Kringle domain proteins. 


BL00021B 13.33 1.482e-14 41-59 
BL00021D 24.56 2.122e-12 193-235 


417 


PR00722 


CHYMOTRYPSIN SERINE 
PROTEASE FAMILY (S1) 
SIGNATURE 


PR00722A 12.27 7.517e-14 42-58 
PR00722B 12.51 3.143e-10 97-112 


417 


BL00134 


Serine proteases, trypsin family, 
histidine proteins. 


BL00134A 11.96 6.464e-16 41-58 
BL00134C 13.45 2.059e-09 221-235 


417 


BL00495 


Apple domain proteins. 


BL00495O 13.75 2.440e-09 212-241 


417 


BL00672 


Serine proteases, V8 family, histidine 
proteins. 


BL00672A 9.79 9.520e-09 41-57 


417 


PR00839 


V8 SERINE PROTEASE FAMILY 
SIGNATURE 


PR00839B 11.20 9.753e-09 41-59 


418 


BL01207 


Glypicans proteins. 


BL01207B 23.69 9.122e-28 191-237 
BL01207A 12.21 1.000e-16 62-78 


423 


PD02870 


RECEPTOR INTERLEUKIN-1 
PRECURSOR. 


PD02870D 15.74 4.351e-09 693-728 


423 


DM00179 


KINASE ALPHA ADHESION T- 
CELL. 


DM00179 13.97 5.696e-09 793-803 


424 


BL00203 


Vertebrate metallothioneins proteins. 


BL00203 13.94 5.041e-09 13-59 


425 


BL00107 


Protein kinases ATP-binding region 
proteins. 


BL00107A 18.39 8.141e-18 217-248 


425 


BL00240 


Receptor tyrosine kinase class III 
proteins. 


BL00240E 11.56 6.040e-10 203-241 


425 


PR00109 


TYROSINE KINASE CATALYTIC 
DOMAIN SIGNATURE 


PPAA1AQP* 1 9 97 5 R1A<» 14 917 916 
xxNAJUIUSJxj \.L.Ll J.OlHC-l^T *i /-Z.JO 

PR00109A 15.00 1.730e-09 182-196 


428 


PR00141 


PROTEASOME COMPONENT 
SIGNATURE 


PPAA1 A1P 1 1 K A 111** 19 91A_9A£ 
rl\.UU141V* 1 1.13 OO-JOe-lZ Z3H-Z10 

PR00141D 12.45 8.615e-12 259-271 

PPAA1A1P 11 ICQ C/:i 0 10 991 915 
rxvUUl41xJ 11.13 iOOie-lZ ZZ3-Z33 

PR00141A 11.36 2.050e-11 102-118 


428 


BL00854 


Proteasome B-type subunits proteins. 


DT AAfcCAA 11 Ol 1 lfll*» 1O00 

BL00854C 29.92 5.235e-14 206-235 

RT AAOCAf\ 1 1 If, 9 Rfinp HO 957 9/\7 


429 


PR00245 


OLFACTORY RECEPTOR 
SIGNATURE 


PR00245A 18.03 9.413e-17 59-81 

PPO.fi9A*P 7C17 5AYW16 91R fc 954 

PR00245E 12.40 2.500e-12 291-306 
PR00245B 10.38 9.112e-11 177-192 


429 


PR00237 


RHODOPSIN-LIKE GPCR 
SUPERFAMILY SIGNATURE 


PR00237E 13.03 7.120e-12 199-223 
PR00237C 15.69 1.225e-09 104-127 


429 


BL00237 


G-protein coupled receptors proteins. 


BL00237A 27.68 9.727e-14 90-130 
BL00237D 11.23 1.273e-09 282-299 


429 


PR00534 


MELANOCORTIN RECEPTOR 
FAMILY SIGNATURE 


PR00534A 11.49 6.400e-09 51-64 


430 


PF00651 


BTB (also known as BR-C/Ttk) domain 


PF00651 15.00 1.000e-11 87-100 


430 


BL00028 


Zinc finger, C2H2 type, domain 
proteins. 


BL00028 16.07 4.706e-14 474-491 BL00028 
16.07 1.771e-09 502-519 


430 


PD00066 


PROTEIN ZINC-FINGER METAL- 
BINDI. 


PD00066 13.92 4.300e-09 490-503 


430 


PR00048 


C2H2-TYPE ZINC FINGER 
SIGNATURE 


PR00048A 10.52 4.600e-09 499-513 


433 


BL00086 


Cytochrome P450 cysteine heme-iron 
ligand proteins. 


BL00086 20.87 3.209e-23 430-462 


433 


PR00465 


E-CLASS P450 GROUP IV 
SIGNATURE 


PPAAA£^I? 11 17 1 1£Aa 1 1 AHA/HQ 
xxvUU403r 13.3/ 1.30we-ll *t\JU-*riy 


433 


PR00359 


B-CLASS P450 SIGNATURE 


PR00359G 11.22 8.071e-10 401-417 
PR00359F 24.20 2.180e-09 373-401 


433 


PR00385 


P450 SUPERFAMILY SIGNATURE 


PR00385E 12.66 8.800e-11 440-452 
PR00385D 13.11 4.429e-10 431-441 
PR00385A 14.97 5.865e-09 302-320 


433 


PR00464 


E-CLASS P450 GROUP II 
SIGNATURE 


PR00464G 12.41 9.000e-10 405-421 
PR00464D 17.40 1.191e-09 320-338 
PR00464E 18.28 6.946e-09 349-370 
PR00464H 13.32 7.750e-09 427-441 
PR00464C 18.84 9.014e-09 291-320 
PR00464I 14.64 9.481e-09 440-464 


434 


BL00216 


Sugar transport proteins. 


BL00216B 27.64 7.943e-19 101-151 


434 


PR00171 


SUGAR TRANSPORTER 
SIGNATURE 


PR00171D 12.76 3.593e-11 413-435 


435 


PR00049 


WILM'S TUMOUR PROTEIN 
SIGNATURE 


PR00049D 0.00 8.429e-10 10-25 


435 


BL00028 


Zinc finger, C2H2 type, domain 
proteins. 


BL00028 16.07 4.150e-13 138-593 BL00028 
16.07 6.850e-13 1010-1027 BL00028 16.07 
6.087e-12 982-999 BL00028 16.07 8.615e- 
11 846-863 BL00028 16.07 3.100e-10 317- 
334 BL00028 16.07 7.000e-10 170-187 
BL00028 16.07 8.500e-10 289-306 BL00028 
16.07 8.800e-10 548-565 


435 


PD00066 


PROTEIN ZINC-FINGER METAL- 
BINDI. 


PD00066 13.92 7.600e-14 998-1011 
PD00066 13.92 1.000e-11 305-318 PD00066 
13.92 8.826e-11 564-577 PD00066 13.92 
3.400e-09 862-875 


435 


PR00456 


RIBOSOMAL PROTEIN P2 
SIGNATURE 


T)T> firV/K/TE? 1 A/C C T)Q n AO 1*7*7 1Q*> 

j^KUU4oab j.uo o.jzye-ui* 1 / /-iyz 
PR00456E 3.06 5.899e-09 140-155 






Streptomyces subtilisin-type inhibitors 
proteins. 




435 


PR00048 


C2H2-TYPE ZINC FINGER 
SIGNATURE 


PR00048A 10.52 9.357e-13 573-587 
PR00048A 10.52 2.421e-11 1007-1021 
PR00048B 6.02 2.125e-10 561-133 
PR00048A 10.52 8.043e-10 314-328 
PR00048B 6.02 1.000e-09 995-1005 
PR00048B 6.02 6.684e-09 302-312 
PR00048A 10.52 9.280e-09 167-181 


436 


PR00245 


OLFACTORY RECEPTOR 
SIGNATURE 


PR00245A 18.03 2.667e-23 100-122 
PR00245C 7.84 1.783e-14 232-248 
PR00245D 10.47 7.070e-10 268-280 


436 


PR00237 


RHODOPSIN-LIKE GPCR 
SUPERFAMILY SIGNATURE 


PR00237C 15.69 8.500e-11 145-168 
PR00237G 19.63 6.023e-09 266-293 


436 


BL00237 


G-protein coupled receptors proteins. 


BL00237A 27.68 2.161e-15 131-171 
BL00237D 11.23 8.091e-09 276-293 


437 


PR00262 


IL1/HBGF FAMILY SIGNATURE 


PR00262A 28.26 1.000e-08 80-108 


438 


BL00884 


Osteopontin proteins. 


BL00884B 12.47 1.000e-40 50-94 
BL00884C 22.45 6.187e-39 131-173 
BL00884A 11.35 5.846e-32 1-31 BL00884E 
11.04 8.364e-23 273-295 BL00884D 8.79 
3.323e-18 255-272 


438 


PR00216 


OSTEOPONTIN SIGNATURE 


PR00216B 7.89 4.553e-34 37-67 PR00216A 
10.94 8.054e-33 2-32 PR00216C 9.63 
2.565e-32 67-93 PR00216G 12.39 8.676e-27 
238-264 PR00216H 7.41 5.295e-22 273-293 
PR00216F 11.79 3.133e-21 164-183 
PR00216D 2.74 5.800e-18 104-119 
PR00216E 8.44 4.405e-16 132-147 
SEQ 
ID 


Pfam Model 


Description 


E-value 


Score 


No: of 

Pfam 

Domains 


Position of 
the Domain 


1 


zf-CCCH 


Zinc finger C-x8-C-x5-C-x3- 
H type 


1.8e-05 


31.6 


1 


412-438 


1 


zf-C3HC4 


Zinc finger, C3HC4 type 
(RING finger) 


2e-05 


21.8 


1 


14-52 


3 


EMP24_GP25L 


emp24/gp25L/p24 family 


4.1e-105 


362.6 


1 


22-235 


6 


WW 


WW domain 


1.2e-05 


32.2 


1 


45-75 


7 


WW 


WW domain 


1.2e-05 


32.2 


1 


45-75 


8 


Aa_trans 


Transmembrane amino acid 
transporter protein 


9.6e-64 


225.2 


1 


71-451 


9 


Fe-ADH 


Iron-containing alcohol 
dehydrogenase 


9.9e-35 


124.5 


2 


4-205:228- 
255 


10 


Fe-ADH 


Iron-containing alcohol 
dehydrogenase 


9.9e-35 


124.5 


2 


52-253:276- 
303 


11 


Bcl-2 


Apoptosis regulator proteins, 
Bcl-2 family 


0.016 


-2.1 


1 


257-356 


12 


spectrin 


Spectrin repeat 


1.3e-10 


43.6 


3 


11-87:90- 
197:200-291 


13 


Ribosomal_L18ae 


Ribosomal L18ae protein 
family 


1.9e-128 


440.1 


1 


6-176 


14 


Ribosomal_L31e 


Ribosomal protein L31e 


2.4e-47 


170.7 


1 


72-166 


15 


zf-CCCH 


Zinc finger C-x8-C-x5-C-x3- 
H type 


7.8e-16 


66.0 




342-367:371- 
396:398-420 


16 


zf-MYND 


MYND finger 


1.4e-13 


58.5 


1 


52-90 


17 


Sterile 


Male sterility protein 


1.1e-51 


185.1 




254-446 


18 


MgtE 


Divalent cation transporter 


8.6e-39 


142.3 


2 


138-274:352- 
499 


19 


Rap_GAP 


Rap/ran-GAP 


2e-124 


426.7 




400-588 


19 


PDZ 


PDZ domain (Also known as 
DHR or GLGF) 


2.4e-06 


34.5 


1 


726-800 


20 


Rap_GAP 


Rap/ran-GAP 


2e-124 


426.7 


1 


400-588 


20 


PDZ 


PDZ domain (Also known as 
DHR or GLGF) 


2.4e-06 


34.5 




726-800 


22 


SCAN 


SCAN domain 


1.5e-23 


91.7 


1 


165-238 


23 


RhoGAP 


RhoGAP domain 


3e-58 


206.9 


1 


497-649 


23 


FCH 


Fes/CIP4 homology domain 


1.2e-18 


75.4 




22-121 


23 


SH3 


SH3 domain 


2.6e-11 


51.0 


1 


723-777 


24 


adh_zinc 


Zinc-binding dehydrogenases 


1.5e-05 


-25.4 




20-336 


25 


UDPGT 


UDP-glucoronosyl and UDP- 
glucosyl transferas 


1.6e-84 


294.3 




26-467 


28 


Ribosomal_L6e 


Ribosomal protein L6e 


4.3e-77 


269.5 




109-239 


29 


Ribosomal_L11 


Ribosomal protein L11 


4.9e-64 


226.2 




13-144 


30 


tRNA-synt_1e 


tRNA synthetases class I (C) 


1.6e-137 


470.2 




64-538 


32 


zf-C3HC4 


Zinc finger, C3HC4 type 
(RING finger) 


0.00041 


17.6 




33-72:165- 
185 


34 


ras 


Ras family 


1.4e-77 


271.2 




35-235 


34 


arf 


ADP-ribosylation factor 
family 


9.3e-05 


-56.3 




17-198 


36 


SET 


SET domain 


3.2e-05 


10.0 


1 


209-342 


36 


MORN 


MORN repeat 


0.006 


23.2 


3 


36-58:59- 
81:106-128 


37 


laminin_G 


Laminin G domain 


1.5e-11 


44.7 




55-174 


37 


EGF 


EGF-like domain 


0.0033 


24.1 


1 


202-234 


38 


Sema 


Sema domain 


1.7e-127 


436.9 


1 


56-489 


38 


Plexin_repeat 


Plexin repeat 


1e-06 


35.7 




507-563 


38 


ig 


Immunoglobulin domain 


0.0023 


15.9 




582-639 


38 


integrin_B 


Integrins, beta chain 


0.084 


6.1 


1 


513-527 


40 


filament 


Intermediate filament protein 


1.6e-138 


473.6 




129-442 


41 


Keratin_B2 


Keratin, high sulfur B2 
protein 


1.8e-18 


74.8 


2 


2-138:139- 
240 


44 


sushi 


Sushi domain (SCR repeat) 


3.8e-06 


33.9 


4 


1396- 

1459:1464- 
1521:1525- 
1590:1595- 
1646 


45 


profilin 


Profilin 


4.1e-13 


51.7 


1 


10-124 


47 


ubiquitin 


Ubiquitin family 


0.00033 


20.5 


1 


31-99 


48 


BTB 


BTB/POZ domain 


2.6e-21 


84.2 


1 


80-196 


48 


Kelch 


Kelch motif 


2.6e-20 


80.9 


4 


336-382:384- 

430:432- 

478:582-635 


48 


SCP 


SCP-like extracellular protein 


0.015 


13.0 


1 


1-35 


49 


serpin 


Serpin (serine protease 
inhibitor) 


2.4e-178 


605.4 


1 


59-432 


50 


T-box 


T-box 


3.6e-125 


429.2 


1 


140-331 


52 


7tm_1 


7 transmembrane receptor 
(rhodopsin family) 


1.2e-17 


58.3 


2 


132-228:337- 
344 


53 


CSD 


'Cold-shock' DNA-binding 
domain 


1.8e-16 


63.6 


1 


42-112 


53 


zf-CCHC 


Zinc knuckle 


0.00012 


28.8 


2 


137-154:159- 
176 


54 


ig 


Immunoglobulin domain 


2.5e-07 


28.7 


1 


34-109 


55 


Rap_GAP 


Rap/ran-GAP 


5e-18 


73.3 


1 


287-466 


57 


G-gamma 


GGL domain 


1.8e-11 


39.4 


2 


49-70:109- 


58 


T-box 


T-box 


8.9e-114 


391.4 


1 


101-302 


59 


Gag_p10 


Retroviral GAG p10 protein 


9.2e-06 


23.7 


1 


82-171 


61 


60s_ribosomal 


60s Acidic ribosomal protein 


0.0089 


12.0 


1 


1-22 


62 


UPAR_LY6 


u-PAR/Ly-6 domain 


5.4e-05 


22.3 


1 


8-51 


63 


Ribosomal_L30 


Ribosomal protein L30p/L7e 


0.00042 


18.5 


1 


65-93 


64 


filament 


Intermediate filament protein 


1.1e-78 


274.8 


2 


161-338:339- 
426 


65 


Ribosomal_S6 


Ribosomal protein S6 


0.00082 


7.5 


1 


2-96 


66 


PDZ 


PDZ domain (Also known as 
DHR or GLGF) 


5.1e-09 


43.4 


1 


158-250 


67 


zf-C3HC4 


Zinc finger, C3HC4 type 
(RING finger) 


0.005 


14.0 


1 


92-118 


68 


G-patch 


G-patch domain 


6.8e-07 


36.3 


1 


26-70 


69 


Keratin_B2 


Keratin, high sulfur B2 
protein 


0.037 


-45.9 


1 


10-155 


83 


ig 


Immunoglobulin domain 


8.3e-09 


33.4 


2 


34-89:119- 
187 


86 


zf-C2H2 


Zinc finger, C2H2 type 


2.2e-71 


250.6 


17 


182-204:210- 
232:237- 
260:265- 
288:315- 
337:343- 
365:369- 
392:653- 
675:681- 
704:709- 
733:741- 
764:791- 
814:820- 
842:848- 
870:877- 
899:905- 
928:952-975 


87 


ig 


Immunoglobulin domain 


2.7e-35 


118.7 


6 


36-121:162- 

249:292- 

375:422- 

517:564- 

657:704-795 


88 


MAP1_LC3 


Microtubule associated 
protein 1A/1B, light 


9.4e-79 


275.0 


1 


118-221 


89 


WD40 


WD domain, G-beta repeat 


1.6e-12 


55.1 


4 


173-215:221- 
263:269- 
305:1103- 
1140 


90 


FKBP 


FKBP-type peptidyl-prolyl 
cis-trans isomeras 


1.2e-59 


198.9 


1 


66-160 


92 


RPEL 


RPEL repeat 


6.5e-18 


73.0 


2 


513-538:551- 
576 


93 


transket_pyr 


Transketolase, pyridine 
binding domain 


4.6e-65 


229.6 


1 


568-773 


93 


E1_dehydrog 


Dehydrogenase E1 
component 


8.7e-23 


89.1 


1 


193-504 


95 


zf-C3HC4 


Zinc finger, C3HC4 type 
(RING finger) 


8.7e-09 


32.7 


1 


595-635 


97 


ig 


Immunoglobulin domain 


1.8e-20 


71.0 


3 


31-88:127- 
185:222-278 


98 


ig 


Immunoglobulin domain 


1.8e-20 


71.0 


3 


24-81:120- 
178:215-271 


99 


Patched 


Patched family 


6.2e-06 


-369.1 


1 


66-935 


102 


zf-C2H2 


Zinc finger, C2H2 type 


2.3e-94 


326.9 


12 


209-231:237- 

259:265- 

287:293- 

315:321- 

343:349- 

371:377- 

399:405- 

427:433- 

455:461- 

483:489- 

511:594-616 
102 


KRAB 


KRAB box 


3.7e-37 


136.9 


1 


15-77 


103 


zf-C2H2 


Zinc finger, C2H2 type 


1.2e-55 


198.2 


9 


172-195:271- 

293:299- 

321:327- 

349:355- 

377:383- 

405:411- 

433:439- 

461:467-489 


103 


KRAB 


KRAB box 


3e-46 


167.1 


1 


8-70 


107 


zf-CCHC 


Zinc knuckle 


2.4e-16 


67.8 


3 


913- 

930:1293- 

1310:1358- 

1375 


107 


NTP_transf_2 


Nucleotidyltransferase 
domain 


4.4e-11 


50.3 


1 


972-1065 


108 


zf-C2H2 


Zinc finger, C2H2 type 


1.6e-42 


154.7 


5 


283:289- 
311:317- 
339:345- 
367:373-395 


109 


myosin_head 


Myosin head (motor domain) 


0 


1267.5 


1 


26-697 


109 


IQ 


IQ calmodulin-binding motif 


1.2e-17 


72.1 


4 


714-734:737- 

757:760- 

780:789-809 


110 


pkinase 


Protein kinase domain 


1.2e-96 


334.5 


1 


20-271 


111 


WD40 


WD domain, G-beta repeat 


1.8e-49 


177.8 


8 


161-197:218- 

253:258- 

294:300- 

335:341- 

377:383- 

428:434- 

470:476-511 


112 


SNF2_N 


SNF2 and others N-terminal 
domain 


4.2e-78 


272.9 


1 


1-264 


112 


helicase_C 


Helicase conserved C- 
terminal domain 


1.2e-24 


95.4 


1 


326-410 


113 


DUF15 


Domain of unknown function 
DUF15 


0.00064 


-60.4 


1 


132-384 


114 


DSPc 


Dual specificity phosphatase, 
catalytic 


0.0004 


-2.9 


1 


141-295 


114 


Y_phosphatase 


Protein-tyrosine phosphatase 


0.0037 


-26.9 




128-295 


115 


Ulp1_C 


Ulp1 protease family, C- 
terminal catalytic d 


2.8e-52 


187.1 




394-587 


117 


Rhodanese 


Rhodanese-like domain 


1e-05 


32.4 




160-260 


119 


ABC1 


ABC1 family 


1.7e-40 


147.9 




318-434 


122 


proteasome 


Proteasome A-type and B- 
type 


7.4e-43 


155.8 




39-146 


124 


Ribosomal_L9 


Ribosomal protein L9 


3.1e-05 


-3.4 




94-240 


125 


RIO1 


RIO1/ZK632.3/MJ0444 
family 


7.8e-80 


278.6 




193-387 


128 


abhydrolase 


alpha/beta hydrolase fold 


4.5e-20 


80.1 


1 


121-364 


129 


TPR 


TPR Domain 


4.8e-27 


103.3 


7 


355-388:473- 
506:507- 
540:654- 
687:688- 
721:722- 
755:756-789 




HMG14_17 


HMG14 and HMG17 


1.9e-15 


64.7 


1 


2-73 


131 


bZIP 


bZIP transcription 


8.3e-19 


71.7 


1 


288-352 


132 


rrm 


RNA recognition motif. 


1.9e-31 


117.9 


3 


432-502:546- 
616:858-929 


133 


AMP-binding 


AMP-binding enzyme 


7.1e-117 


401.7 


1 


142-580 


138 


tubulin 


Tubulin/FtsZ family 


2.1e-151 


516.4 


1 


1-223 


141 


laminin_EGF 


Laminin EGF-like (Domains 
III and V) 


7.6e-12 


52.8 


4 


252-297:300- 
348:1342- 
1391:1469- 
1530 


141 


Kelch 


Kelch motif 


1.6e-05 


31.8 


4 


654-702:760- 
811:873- 
918:929-990 


141 


integrin_B 


Integrins, beta chain 


0.0061 


9.4 


3 


44-59:100- 

117:1019- 

1028 


141 


EGF 


EGF-like domain 


0.092 


19.3 


8 


167-203:207- 

235:297- 

331:496- 

533:538- 

569:1271- 

1308:1312- 

1338:1478- 

1508 


142 


RUN 


RUN domain 


8e-44 


159.0 


1 


31-163 




FYVE 


FYVE zinc finger 


2.3e-29 


109.1 


1 


529-593 


143 


zf-C2H2 


Zinc finger, C2H2 type 


1.7e-33 


124.7 


5 


442-464:505- 
527:533- 
555:561- 
583:589-611 


143 


BTB 


BTB/POZ domain 


1.6e-22 


88.2 


1 


30-143 


144 


mito_carr 


Mitochondrial carrier protein 


3.6e-61 


216.6 


3 


10-158:160- 
250:254-354 


140 


DAGKc 


Diacylglycerol kinase 
catalytic domain 


0.00015 


26.0 


1 


157-303 


147 


Exonuclease 


Exonuclease 


1.6e-41 


151.4 


1 


228-384 


147 


rrm 


RNA recognition motif. 


9.5e-08 


39.2 


2 


507-574:602- 
674 


151 


WH2 


WH2 motif 


6.5e-20 


79.6 


3 


1194- 
1214:1234- 
1254:1322- 
1342 


154 


DHDPS 


Dihydrodipicolinate 
synthetase family 


9.1e-21 


82.4 


1 


3-270 


156 


PseudoU_synth_l 


tRNA pseudouridine synthase 


1e-30 


115.4 


1 


111-322 


157 


pkinase 


Protein kinase domain 


2.3e-59 


210.6 


1 


216-512 


158 


ubiquitin 


Ubiquitin family 


2.4e-05 


24.6 


1 


3-79 
IF-2B 


Initiation factor 2 subunit 
family 


1.7e-98 


340.7 


1 


157-475 


161 


Beach 


Beige/BEACH domain 


1.1e-224 


759.8 


1 


1470-1747 


161 


WD40 


WD domain, G-beta repeat 


2.9e-08 


40.9 


5 


1848- 

1882:1888- 
1928:1947- 
1983:2030- 
2064:2071- 
2107 


164 


DnaJ 


DnaJ domain 


1.9e-16 


68.1 


1 


125-189 


165 


Anti_proliferat 


BTG1 family 


7.4e-85 


295.3 


1 


11-164 


166 


sugar_tr 


Sugar (and other) transporter 


1.2e-78 


274.7 


1 


34-548 


167 


sugar_tr 


Sugar (and other) transporter 


7e-52 


185.8 


1 


34-480 


168 


zf-C2H2 


Zinc finger, C2H2 type 


1.7e-93 


324.0 


13 


222-244:250- 

272:278- 

300:306- 

328:334- 

356:362- 

384:390- 

412:418- 

440:446- 

468:474- 

496:502- 

524:530- 

552:558-580 


168 


KRAB 


KRAB box 


l.oe-35 


131.2 


1 


5/-uy 




GBP 


Guanylate-binding protein, 
N-terminal domain 


le-iyi 


oJo.Z 


l 


1 T7C 

l-Z/5 


169 


GBP_C 


Guanylate-binding protein, C- 
terminal domain 


o.oe- loz 




i 


0*7*7 C.*J*a 


170 


cyclin 


Cyclin, N-terminal domain 


0.0022 


O 1 

y.3 


1 


4o-iy2 


171 


TPR 


TPR Domain 


O Ha, A1 


15 j. 4 


0 


133-100:10/- 
o/vvoni 

Olil.OOO 

Z^4.ZoZ- 
315:316- 


173 


RhoGEF 


RhoGEF domain 






1 
1 




173 




PH domain 






1 

1 


^ / O-HOj 


173 


SH3 


SH3 domain 


l.le-10 


48.9 


1 


72-126 


174 


zf-C3HC4 


Zinc finger, C3HC4 type 
(RING finger) 


0.00011 


19.4 


1 


18-55 


174 


GBP_C 


Guanylate-binding protein, C- 
terminal domain 


0.016 


12.1 


1 


86-114 


175 


Peptidase_M22 


Glycoprotease family 


2.3e-73 


257.2 


1 


1-324 


177 


TBC 


TBC domain 


4.7e-08 


10.1 


1 


57-268 


178 


transmembrane4 


Tetraspanin family 


1.6e-78 


259.2 


1 


16-261 


179 


CH 


Calponin homology (CH) 
domain 


1.2e-25 


98.6 


1 


24-133 


179 


calponin 


Calponin family repeat 


1.7e-14 


51.8 


1 


174-199 


182 


AP_endonucleas1 


AP endonuclease family 1 


2.6e-17 


59.4 


2 


1-36:50-135 


184 


Bacterial_PQQ 


PQQ enzyme repeat 


9.3e-05 


29.2 


2 


52-89:534- 
571 




185 


DEAD 


DEAD/DEAH box helicase 


1.6e-60 


194.3 


1 


216-420 


185 


helicase_C 


Helicase conserved C- 
terminal domain 


5.9e-25 


96.3 


1 


454-540 










186 


zf-C2H2 


Zinc finger, C2H2 type 


3.2e-24 


93.9 


6 


106-128:134- 
156:162- 
184:195- 
218:477- 
499:505-529 


187 


sugar_tr 


Sugar (and other) transporter 


0.0014 


-90.1 


1 


272-672 


188 


tRNA_int_endo 


tRNA intron endonuclease, 
catalytic C-t 


0.0025 


-7.7 


1 


73-159 










189 


WSC 


WSC domain 


1e-35 


132.1 


1 


175-254 


189 


Sulfotransfer 


Sulfotransferase protein 


4e-34 


126.8 


1 


356-586 


191 


pkinase 


Protein kinase domain 


5.1e-75 


262.6 


1 


148-421 


191 


PDZ 


PDZ domain (Also known as 
DHR or GLGF) 


1.3e-05 


32.1 


1 


740-827 










193 


globin 


Globin 


1.9e-26 


96.6 


1 


3-78 


195 


WD40 


WD domain, G-beta repeat 


6.7e-14 


59.6 


4 


64-108:116- 
153:158- 
194:288-323 


197 


BRO1 


BRO1-like domain 


0.0042 


-29.4 


1 


9-161 


198 


F_actin_cap_B 


F-actin capping protein, beta 
subunit 


1.7e-224 


759.2 


1 


1-269 










199 


ank 


Ank repeat 


1e-66 


235.0 


8 


40-73:82- 
114:115- 
147:148- 
180:181- 
212:213- 
246:481- 
526:527-559 


203 


PDZ 


PDZ domain (Also known as 
DHR or GLGF) 


4.2e-07 


37.0 


1 


211-293 










204 


SAM 


SAM domain (Sterile alpha 
motif) 


1.2e-11 


52.1 


1 


5-70 










205 


SAM 


SAM domain (Sterile alpha 
motif) 


1.2e-11 


52.1 


1 


5-70 










206 


zf-UBR1 


Putative zinc finger in N- 
recognin 


4.7e-25 


96.7 


1 


978-1046 










207 


ABC_tran 


ABC transporter 


2.4e-112 


386.6 


2 


467- 
647:1536- 
1717 


209 


zf-C2H2 


Zinc finger, C2H2 type 


0.00035 


27.3 


1 


200-225 


210 


UCH-2 


Ubiquitin carboxyl-terminal 
hydrolase family 


1.5e-19 


78.4 


1 


385-454 










211 


IMP4 


Domain of unknown function 


2.2e-33 


124.3 


1 


144-297 


213 


zf-C2H2 


Zinc finger, C2H2 type 


2.9e-08 


40.9 


3 


12-37:173- 
198:208-230 


214 


LysM 


LysM domain 


2.1e-11 


51.3 


1 


73-116 


215 


ank 


Ank repeat 


1.1e-05 


32.3 


2 


834-867:879- 
912 


215 


TIG 


IPT/TIG domain 


0.009 


22.6 


1 


642-723 


217 


pyr_redox 


Pyridine nucleotide- 
disulphide oxidoreducta 


1.7e-71 


251.0 


1 


196-470 










217 


Rieske 


Rieske [2Fe-2S] domain 


6.2e-20 


79.6 


1 


68-168 


218 


PDZ 


PDZ domain (Also known as 
DHR or GLGF) 


8.5e-19 


75.9 


1 


642-728 


219 


pkinase 


Protein kinase domain 


8.1e-67 


235.4 


1 


26-204 


220 


dsrm 


Double-stranded RNA 
binding motif 


0.095 


15 


1 


100-172 


221 


PHD 


PHD-finger 


5.4e-05 


29.6 


1 


147-203 


222 


L27 


L27 domain 


6.5e-16 


66.3 


1 


13-68 


222 


SAM 


SAM domain (Sterile alpha 
motif) 


7.2e-10 


46.2 


2 


1051- 

1117:1166- 
1230 


223 


TRM 


N2,N2-dimethylguanosine 
tRNA methyltransfera 


7.3e-22 


86.1 


1 


227-693 


224 


LIM 


LIM domain 


5.3e-06 


33.4 


2 


124-180:183- 
243 


225 


ig 


Immunoglobulin domain 


1.1e-07 


29.8 


1 


55-144 


227 


F-box 


F-box domain 


1.3e-05 


32.1 


1 


11-59 


229 


Glucosamine_iso 


Glucosamine-6-phosphate 
isomerases/6- 


2.7e-158 


539.3 


1 


15-250 


231 


PTN_MK 


PTN/MK heparin-binding 
protein family 


3.6e-44 


160.2 


1 


51-148 


236 


ion_trans 


Ion transport protein 


1.6e-22 


88.3 


1 


174-393 


238 


GNS1_SUR4 


GNS1/SUR4 family 


5.2e-46 


166.3 


1 


10-265 


240 


ubiquitin 


Ubiquitin family 


2.7e-05 


24.4 


1 


10-89 


241 


PIP5K 


Phosphatidylinositol-4- 
phosphate 5-Kinase 


1.5e-155 


530.2 


1 


124-420 


242 


cadherin 


Cadherin domain 


0 


1298.9 


19 


1-75:89- 

180:194- 

290:355- 

434:448- 

549:563- 

652:671- 

774:788- 

881:896- 

988:1002- 

1092:1106- 

1192:1206- 

1295:1309- 

1379:1393- 

1489:1503- 

1594:1608- 

1699:1713- 

1808:1814- 

1910:1922- 

2016 


244 


fn3 


Fibronectin type III domain 


1.2e-31 


118.6 


4 


58-140:152- 

238:249- 

333:345-426 


245 


UQ_con 


Ubiquitin-conjugating 
enzyme 


1.4e-16 


68.5 


1 


93-250 


246 


LRR 


Leucine Rich Repeat 


1.7e-14 


61.6 


6 


51-75:76- 
99:155- 
178:181- 
203:204- 
226:227-251 


247 


lipocalin 


Lipocalin / cytosolic fatty- 
acid binding pr 


1.2e-28 


102.8 


1 


164-294 


248 


Ribosomal_S2 


Ribosomal protein S2 


2.9e-11 


43.7 


1 


33-80 


249 


tubulin 


Tubulin/FtsZ family 


8.5e-163 


554.2 


1 


1-277 


250 


tubulin 


Tubulin/FtsZ family 


2.4e-212 


718.8 


1 


1-351 


251 


ATP-synt_ab 


ATP synthase alpha/beta 
family, nucleot 


1.2e-75 


264.8 


1 


138-346 


251 


ATP-synt_ab_C 


ATP synthase alpha/beta 
chain, C termin 


2.7e-38 


140.6 


1 


348-456 


251 


ATP-synt_ab_N 


ATP synthase alpha/beta 
family, beta-ba 


5.4e-19 


76.5 


1 


67-135 


252 


ATP-synt_ab 


ATP synthase alpha/beta 
family, nucleot 


1.3e-70 


248.0 


1 


138-344 


252 


ATP-synt_ab_N 


ATP synthase alpha/beta 
family, beta-ba 


5.4e-19 


76.5 


1 


67-135 


253 


zf-C3HC4 


Zinc finger, C3HC4 type 
(RING finger) 


5e-12 


43.2 


1 


39-79 


254 


G-patch 


G-patch domain 


1.3e-08 


42.1 


1 


410-456 


255 


CH 


Calponin homology (CH) 
domain 


1.6e-11 


51.7 


1 


24-134 


256 


RF-1 


Peptidyl-tRNA hydrolase 
domain 


5.9e-66 


232.5 


1 


225-338 


257 


RF-1 


Peptidyl-tRNA hydrolase 
domain 


5.9e-66 


232.5 


1 


189-302 


258 


OTU 


OTU-like cysteine protease 


4.4e-18 


73.5 


1 


189-304 


259 


thiored 


Thioredoxin 


2e-09 


35.7 


2 


119-165:662- 
695 


260 


thyroglobulin_1 


Thyroglobulin type-1 repeat 


3.1e-34 


127.2 


2 


95-158:227- 
292 


260 


kazal 


Kazal-type serine protease 
inhibitor 


9.3e-07 


35.9 


1 


43-87 


262 


DnaJ 


DnaJ domain 


4.1e-15 


63.6 


1 


277-338 


263 


WD40 


WD domain, G-beta repeat 


4e-21 


83.6 


5 


3-42:49- 
86:97- 
133:142- 
178:184-220 


265 


DUF6 


Integral membrane protein 
DUF6 


0.083 


9.1 


2 


81-316:338- 
470 


266 


Ribosomal_L31e 


Ribosomal protein L31e 


1.7e-61 


217.7 


1 


15-109 


268 


F5_F8_type_C 


F5/8 type C domain 


2.4e-65 


230.5 


1 


42-196 


268 


Zn_carbOpept 


Zinc carboxypeptidase 


3.5e-50 


180.1 


2 


224-341:400- 
600 


270 


BTB 


BTB/POZ domain 


7.7e-18 


72.7 


1 


8-119 


270 


zf-C2H2 


Zinc finger, C2H2 type 


4.2e-13 


57.0 


4 


254-276:363- 

385:390- 

412:448-468 


271 


Glycos_transf_1 


Glycosyl transferases group 1 


0.027 


12.8 


1 


291-385 


272 


HEAT 


HEAT repeat 


2.2e-07 


38.0 


3 


237-275:276- 
315:674-712 


273 


HEAT 


HEAT repeat 


2.2e-07 


38.0 


3 


237-275:276- 
315:640-678 


275 


SPRY 


SPRY domain 


2.6e-34 


127.4 


1 


390-515 


275 


zf-C3HC4 


Zinc finger, C3HC4 type 
(RING finger) 


le-16 


58.5 


1 


29-69 


277 


BTB 


BTB/POZ domain 


6e-27 


103.0 


1 


36-149 


277 


Kelch 


Kelch motif 


9.7e-21 


82.3 


4 


331-390:392- 
441:443- 

Hyj ,J*f\r'D oO 


278 


zf-C2H2 


Zinc finger, C2H2 type 


4.ie-no 


1QO 0 


Izt 


243:249- 
271:277- 
299:305- 
327:333- 
355:361- 

383:389- 
411:417- 
439:445- 
467:473- 
495:501- 
523:529- 
551:557-579 


229 


SCAN 


SCAN domain 


2.4e-52 


187.3 


1 


36-132 


229 


zf-C2H2 


Zinc finger, C2H2 type 


2.4e-51 


184.0 


7 


348-370:375- 

397:403- 

425:431- 

453:459- 

480:486- 

508:514-537 


LdI 






6 6e-20 


79 6 


1 


1-146 






Nucleotidyltransferase 
domain 




JJi/ 




67-174 


286 


zf-C2H2 


Zinc finger, C2H2 type 


2.8e-93 


323.3 


12 


118-140:146- 

168:174- 

196:202- 

224:230- 

252:258- 

280:286- 

308:314- 

336:342- 

364:370- 

392:398- 

420:426-448 


286 


KRAB 


KRAB box 


3.6e-38 


140.2 


1 


8-70 


287 


zf-C2H2 


Zinc finger, C2H2 type 


5.3e-124 


425.4 


17 


183-205:211- 

233:239- 

261:267- 

289:295- 

317:323- 

345:351- 

373:379- 
401:407- 
429:435- 
457:463- 
485:491- 
513:519- 
541:547- 
569:575- 
597:603- 
625:631-653 


289 


DiHfolate_red 


Dihydrofolate reductase 


7.4e-77 


268.8 


1 


4-185 


291 


PDZ 


PDZ domain (Also known as 
DHR or GLGF) 


7.4e-17 


69.4 


1 


5-84 


293 


PH 


PH domain 


1.4e-08 


35.5 


1 


44-147 


294 


adh_short 


short chain dehydrogenase 


3.9e-29 


110.2 


1 


36-284 


297 


PKD 


PKD domain 


9.9e-09 


42.4 


2 


663-753:756- 
839 


297 


BNR 


BNR repeat 


3.2e-06 


34.1 


5 


115-126:156- 
167:351- 
362:428- 
439:470-481 


300 


HMG_box 


HMG (high mobility group) 
box 


5.4e-05 


20.0 


1 


245-304 


301 


ig 


Immunoglobulin domain 


0.05 


11.6 


1 


629-688 


302 


zf-C3HC4 


Zinc finger, C3HC4 type 
(RING finger) 


5e-12 


43.2 


1 


39-79 


303 


START 


START domain 


0.015 


4.1 


1 


1790-1994 


304 


integrase 


Integrase DNA binding 
domain 


7.3e-06 


32.9 


1 


51-96 


305 


myosin_head 


Myosin head (motor domain) 


7.6e-279 


939.7 


2 


11-668:689- 
733 


306 


zf-C2H2 


Zinc finger, C2H2 type 


8.5e-54 


192.1 


7 


66-88:94- 
116:122- 
144:150- 
172:178- 
200:280- 
303:317-339 


307 


ig 


Immunoglobulin domain 


0.00023 


19.1 


2 


35-104:136- 
194 


309 


ras 


Ras family 


0.00079 


-93.3 


1 


38-176 


310 


ig 


Immunoglobulin domain 


2.1e-06 


25.7 


1 


37-112 


311 


EF1BD 


EF-1 guanine nucleotide 
exchange domain 


4.7e-56 


199.6 


1 


139-225 


312 


BTB 


BTB/POZ domain 


8.4e-25 


95.8 


1 


51-164 


313 


zf-C2H2 


Zinc finger, C2H2 type 


7.7e-59 


208.9 


9 


118-140:197- 

219:281- 

303:309- 

331:337- 

359:365- 

387:393- 

415:421- 

443:449-471 


313 


KRAB 


KRAB box 


1.4e-17 


71.8 


1 


41-99 
314 


Hydrolase 


haloacid dehalogenase-like 
hydrolase 


0.045 


8.2 


1 


213-671 


315 


cNMP_binding 


Cyclic nucleotide-binding 
domain 


4e-26 


100.2 


1 


387-475 


315 


ion_trans


Ion transport protein 


3.8e-19 


77.0 


1 


69-290 


316 


Peptidase_S26 


Signal peptidase I 


2.8e-16 


56.3 


2 


38-98:117- 
139 


317 


zf-C2H2 


Zinc finger, C2H2 type 


4.3e-56 


199.8 


9 


156-178:184- 

206:212- 

234:240- 

262:268- 

290:296- 

318:324- 

346:352- 

374:378-400 


317 


KRAB 


KRAB box 


6.7e-16 


66.3 


1 


11-73 


319 


UPF0073 


Uncharacterised protein 
family 


1.8e-09 


27.9 


1 


33-276 


320 


EGF 


EGF-like domain 


4.7e-08 


40.2 


1 


26-59 


321 


lectin_c 


Lectin C-type domain 


8.6e-15 


62.6 


1 


268-374 


325 


MAM 


MAM domain


1.3e-52 


188.2 


1 


338-503 


325 


ig 


Immunoglobulin domain 


1.9e-15 


54.8 


3 


41-101:138- 
202:346-420 


327 


MAM 


MAM domain 


5.3e-180 


611.4 


4 


26-169:170- 

329:342- 

498:509-666 


328 


Sema 


Sema domain 


1.5e-211 


716.2 


1 


56-491 


329 


zf-C2H2 


Zinc finger, C2H2 type 


1.5e-84 


294.3 


13 


170-192:198- 

220:226- 

248:254- 

276:282- 

304:310- 

332:338- 

360:366- 

388:394- 

416:422- 

444:450- 

472:478- 

500:506-528 


331 


PAP2 


PAP2 superfamily 


8e-22 


85.9 


1 


160-314 


332 


LRR 


Leucine Rich Repeat 


3.4e-36 


133.7 


11 


58-81:82- 

105:106- 

129:130- 

153:154- 

177:178- 

201:202- 

225:250- 

273:274-

297:298- 

321:322-345 


332


ig


Immunoglobulin domain


2.5e-08 


31.9 


1 


425-485 


332 


LRRNT 


Leucine rich repeat N-
terminal domain


2.5e-05


31.1


1


27-56


332 


LRRCT 


Leucine rich repeat C- 
terminal domain 


0.0029 


24.3 


1 


355-408 


333 


AdoHcyase 


S-adenosyl-L-homocysteine
hydrolase 


1.5e-280 


945.4 


1 


214-640 


334 


TBC 


TBC domain 


9.4e-38 


138.9 




89-302 


341 


WD40 


WD domain, G-beta repeat 


0.00094 


25.9 


2


2-32:109-146 


342 


ABC1 


ABC1 family 


0.051 


-29.9 


1 


3-50 


344 


globin 


Globin 


3e-45 


162.2 


1 


1-141 


345 


globin 


Globin 


7.5e-39 


139.9 




1-31:68-179 


347 


F-box 


F-box domain 


1.5e-07 


38.5 




24-72 


348 


HLH 


Helix-loop-helix DNA- 
binding domain 


2e-08 


41.4 


1 


83-137 


349 


KRAB 


KRAB box 


2.7e-39 


144.0 


1


4-66 


350 


UCH-2 


Ubiquitin carboxyl-terminal 
hydrolase family 


1.7e-19 


78.2 




645-705 


350 


UCH-1 


Ubiquitin carboxyl-terminal 
hydrolases famil 


9.1e-15 


62.5 


1


363-394 


350 


zf-UBP 


Zn-finger in ubiquitin- 
hydrolases and other 


0.00069 


18.9 




236-306 


351 


NUDK 


MutT-like domain 


8.2e-12 


52.7 


1


50-200 


352 


IBR 


IBR domain 


1.6e-12 


55.0 




101-166 


353 


IBR 


IBR domain 


1.6e-12 


55.0 


1


66-131 


354 


SCP 


SCP-like extracellular protein 


1.4e-34 


128.3 




56-208 


356 


mito_carr 


Mitochondrial carrier protein 


9.7e-78 


271.7 




10-125:127- 
220:232-321 


358 


UCH-1


Ubiquitin carboxyl-terminal 
hydrolases famil 


5.1e-15 


63.3 




323-354 


358 


zf-UBP 


Zn-finger in ubiquitin- 
hydrolases and other 


0.00049 


19.4 


1


195-264 


360 


Phage_lysozyme


Phage lysozyme 


0.0014 


23.4 




94-184 


362 


Ribosomal_S2 


Ribosomal protein S2 


3.3e-08 


32.9 




20-62 


364 


zf-C3HC4 


Zinc finger, C3HC4 type 
(RING finger) 


5.3e-09 


33.4 




291-329 


365 


zf-C3HC4 


Zinc finger, C3HC4 type 
(RING finger) 


0.0096 


13.1 




109-148 


367 


TPR 


TPR Domain 


0.043 


20.4 




1-28 


370 


zf-C2H2 


Zinc finger, C2H2 type 


5.3e-109 



375.5 


14 


127-149:155- 

177:183- 

205:211- 

233:239- 

261:267- 

289:295- 

317:323- 

345:351- 

373:379- 

401:407- 

429:435- 

457:463- 

485:491-513 


370 


SCAN 


SCAN domain 


4.2e-38 


140.0 


1 


27-122 


371 


arf 


ADP-ribosylation factor
family


4.9e-39


143.1


1


6-184


371 


ras 


Ras family 


7.2e-06 


-70.1 


1 


22-186 


372 


BNR 


BNR repeat 


0.031 


20.9 


3 


171-182:244-
255:295-306


373 


zf-C2H2 


Zinc finger, C2H2 type 


8.3e-25 


95.8 


5 


142-162:171-
198:204-
228:234-
258:264-288


376 


rrm 


RNA recognition motif. 


0.00019 


28.2 


1 


112-163 


377 


rrm 


RNA recognition motif. 


2.2e-19 


77.9 


1 


112-183 


380 


vwc 


von Willebrand factor type C
domain


1.6e-31


118.2


3


22-76:79-
134:137-192


381 


Ribosomal_L35Ae


Ribosomal protein L35Ae 


0.00013 


7.0 


1 


1-79 


385 


ras 


Ras family 


3.9e-63 


223.2 


1 


35-229 


385 


arf 


ADP-ribosylation factor
family


1.7e-05


-46.9


1


18-202


388 


F-box 


F-box domain 


1.5e-05 


31.9 


2 


23-70:99-146 


390 


SPRY 


SPRY domain 


6.2e-10 


46.4 


1 


101-239 


391 


tRNA_Me_trans 


tRNA methyl transferase 


1.9e-19 


50.9 


1 


5-185 


392 


zf-C2H2 


Zinc finger, C2H2 type 


4e-17 


70.3 


3 


175-197:203-
225:231-253


393 


SCAN 


SCAN domain 


3.1e-39 


143.8 


1 


389-484 


393 


SPRY 


SPRY domain 


1.8e-19


78.1 


1 


148-273 


393 


zf-C2H2 


Zinc finger, C2H2 type 


4e-09 


43.7 


2 


759-781:787-
809


393 


zf-C3HC4 


Zinc finger, C3HC4 type
(RING finger)


0.0032


14.7


1


11-52


394 


Kelch 


Kelch motif 


4e-53 


189.9 


5 


329-375:377-
431:433-
479:481-
525:527-572


394 


BTB 


BTB/POZ domain 


6.1e-26 


99.6 


1 


30-144 


395 


C2 


C2 domain 


2.2e-80 


280.4 


2 


159-251:296-
384


396 


ank 


Ank repeat 


5.6e-33 


123.0 


4 


47-79:80-
112:140-
174:175-207


396 


PH 


PH domain 


8.9e-05 


22.0 


1 


236-334 


397 


ank 


Ank repeat 


1.7e-26 


101.4 


4 


17-49:50-
82:83-
115:116-148


398 


Nucleoplasmin


Nucleoplasmin


3.6e-29 


110.4 


1 


13-209 


400 


DAGKa 


Diacylglycerol kinase
accessory domain


1.9e-124


426.8


1


598-778


400 


DAGKc 


Diacylglycerol kinase
catalytic domain


7.1e-67


235.6


1


454-578


400 


DAG_PE-bind


Phorbol esters/diacylglycerol
binding dom


2.9e-23


90.7


2


261-310:326-
374


400 


efhand 


EFhand 


2.4e-12 


54.4 


2 


169-197:214-
242


403 


PDZ 


PDZ domain (Also known as
DHR or GLGF)


7.7e-46


165.7


3


86-166:210-
291:821-907


404 


zf-C2H2


Zinc finger, C2H2 type 


2.6e-48 


173.9 


7 


172-194:200-
222:228-
250:256-
278:284-
306:312-
331:340-362


405 


K_tetra


K+ channel tetramerisation
domain


2.6e-23


90.9


1


51-146


406 


SNF 


Sodium:neurotransmitter
symporter family


0


1268.7


1


60-657


407 


ig 


Immunoglobulin domain 


1.1e-06


26.5 


1 


53-120 


408 


DnaJ 


DnaJ domain 


2.3e-27 


104.3 


1 


4-68 


408 


DnaJ_C 


DnaJ C terminal region 


3.1e-08


38.1 


1 


192-314 


409 


mito_carr 


Mitochondrial carrier protein 


1.4e-57 


204.7 


3 


5-100:102-
201:205-302


410 


zf-C2H2 


Zinc finger, C2H2 type 


5.2e-97


335.7 


12 


141-163:169-
191:197-
219:225-
247:253-
275:281-
303:309-
331:337-
359:365-
387:393-
415:421-
443:449-473


411 


S_100


S-100/ICaBP type calcium
binding domain


9.7e-13


55.8


1


5-48


411 


efhand 


EFhand 


0.0012 


25.6 


1 


54-82 


413 


fn3 


Fibronectin type III domain


8.6e-14 


59.3 


2 


22-107:119-
196


413 


PHD 


PHD-finger 


9.6e-05 


27.2 


1 


285-341 


414 


zf-C2H2 


Zinc finger, C2H2 type 


2.3e-27 


104.4 


6 


42-64:117-
139:145-
167:173-
196:534-
556:573-595


415 


SPRY 


SPRY domain 


3.9e-18 


73.7 


1 


347-467 


415 


zf-C3HC4 


Zinc finger, C3HC4 type
(RING finger)


4.4e-14


49.9


1


16-56


415 


zf-B_box 


B-box zinc finger 


9e-07 


35.9 


1 


92-133 


416 


pkinase 


Protein kinase domain 


1.2e-54 


195.0 


1 


97-317 


417 


trypsin 


Trypsin 


4.6e-38 


122.5 


1 


41-234 


418 


Glypican 


Glypican 


5.7e-131 


448.5 


1 


3-244 


419 


Keratin_B2


Keratin, high sulfur B2
protein


0.0013


-23.4


1


37-159


420 


Dynein_heavy


Dynein heavy chain 


0 


1432.3 


1 


309-1019 


421 


zf-C2H2 


Zinc finger, C2H2 type 


0.00039 


27.2 


3 


75-99:203-
227:266-290


422 


ig 


Immunoglobulin domain 


0.00074 


17.5 


1 


34-107 


423 


fn3


Fibronectin type III domain


6e-08 


39.8 


1 


443-531


424


Keratin_B2 


Keratin, high sulfur B2 
protein 


0.0023 


-27.1 


2 


5-150:152- 
251 


425 


pkinase 


Protein kinase domain 


2.3e-55 


197.3 


1 


69-390 


426


ig


Immunoglobulin domain


4.1e-09 


34.4 


1 


35-112 


427 


Galactosyl_T


Galactosyltransferase 


2.6e-35 


130.8 


1 


158-349 


428 


proteasome 


Proteasome A-type and B- 
type 


5.5e-28 


106.4 


1 


96-238 


429 


7tm_1


7 transmembrane receptor 
(rhodopsin family) 


3.4e-38 


123.5 


1 


41-290 


430 


BTB 


BTB/POZ domain 


8.1e-23 


89.2 


1 


58-173 


430 


zf-C2H2 


Zinc finger, C2H2 type 


4.3e-07 


37.0 


2 


472-494:500- 
523 


433 


p450 


Cytochrome P450 


6.4e-175 


594.5 


1 


33-493 


434 


sugar_tr


Sugar (and other) transporter 


2.6e-64 


227.1 


1 


10-512


435 


zf-C2H2 


Zinc finger, C2H2 type 


1.8e-52 


187.8 


9 


287-309:315- 

337:546- 

568:574- 

596:606- 

628:844- 

866:872- 

894:980- 

1002:1008- 

1030 


436 


7tm_1


7 transmembrane receptor 
(rhodopsin family) 


2.2e-40 


130.4 


2 


82-221:229- 
284 


437 


FGF 


Fibroblast growth factor 


4.6e-14 


51.6 


1 


48-129 


438 


Osteopontin 


Osteopontin 


3.7e-181 


615.2 


1 


1-294
PDB annotation




LIGASE CBL, UBCH7, ZAP-
70,E2,UBIQUITIN,E3,
PHOSPHORYLATION, 2
TYROSINE KINASE,
UBIQUITINATION, PROTEIN
DEGRADATION,


LIGASE CBL, UBCH7, ZAP- 
70,E2,UBIQUITIN,E3, 
PHOSPHORYLATION, 2 
TYROSINE KINASE, 
UBIQUITINATION, PROTEIN 
DEGRADATION, 


METAL BINDING PROTEIN 
RING FINGER PROTEIN 
MATI; RING FINGER 
(C3HC4) 


METAL BINDING PROTEIN 
RING FINGER PROTEIN 
MATI; RING FINGER
(C3HC4)


DNA-BINDING PROTEIN
V(D)J RECOMBINATION
ACTIVATING PROTEIN 1;
RAG1, V(D)J
RECOMBINATION,
ANTIBODY, MAD, RING
FINGER, 2 ZINC BINUCLEAR
CLUSTER, ZINC FINGER,
DNA-BINDING PROTEIN


DNA-BINDING PROTEIN
V(D)J RECOMBINATION
ACTIVATING PROTEIN 1;
RAG1, V(D)J
RECOMBINATION,


Compound 


VIRUS-1 (C3HC4, OR RING
DOMAIN) 1CHC 3 (NMR, 1
STRUCTURE) 1CHC 4


SIGNAL TRANSDUCTION
PROTEIN CBL; CHAIN: A;
ZAP-70 PEPTIDE; CHAIN:
B; UBIQUITIN-
CONJUGATING ENZYME 
E12-18 KDA UBCH7; 
CHAIN: C; 


SIGNAL TRANSDUCTION 
PROTEIN CBL; CHAIN: A; 
ZAP-70 PEPTIDE; CHAIN: 
B; UBIQUITIN-
CONJUGATING ENZYME
E12-18 KDA UBCH7;
CHAIN: C; 


CDK-ACTIVATING 
KINASE ASSEMBLY 
FACTOR MATI; CHAIN: A; 


CDK-ACTIVATING 
KINASE ASSEMBLY 
FACTOR MATI; CHAIN: A; 


RAG1; CHAIN: NULL;


RAG1; CHAIN: NULL;


SEQFOLD 
score 














60.57


PMF 
score 




0.33 


0.93 


0.25 


0.51 


1.00 




Verify 
score 




0.12
PDB annotation


BINDING/EFFECTOR), G 
PROTEIN, EFFECTOR, 
RABCDR, 2 SYNAPTIC 
EXOCYTOSIS,RAB 
PROTEIN, RAB3A, 


HYDROLASE G PROTEIN,
VESICULAR TRAFFICKING,
GTP HYDROLYSIS, RAB 2
PROTEIN,

NEUROTRANSMITTER 
RELEASE, HYDROLASE 


TRANSCRIPTION 
REGULATION SIGMA70; 
RNA POLYMERASE SIGMA 
FACTOR, TRANSCRIPTION 
REGULATION 


COMPLEX (BLOOD 
COAGULATION/INHIBITOR) 
AUTOPROTHROMBIN IIA;
HYDROLASE, SERINE
PROTEINASE), PLASMA
CALCIUM BINDING, 2
GLYCOPROTEIN, COMPLEX
(BLOOD
COAGULATION/INHIBITOR)


ANTI-COAGULANT ANTI-
COAGULANT, PEPTIDIC 2
INHIBITORS,
CONFORMATIONAL 2
FLEXIBILITY, SERINE
PROTEASE INHIBITOR


SUGAR BINDING PROTEIN
UDA; LECTIN, HEVEIN
DOMAIN, UDA,
SUPERANTIGEN
SUGAR BINDING PROTEIN


Compound 




RAB3A; CHAIN: A; 




RNA POLYMERASE 


PRIMARY SIGMA 
FACTOR; CHAIN: NULL;




ACTIVATED PROTEIN C; 
CHAIN: C, L; D-PHE-PRO- 
MAI; CHAIN: P; 


HIRUSTASIN; CHAIN: 




AGGLUTININ ISOLECTIN 
VI/AGGLUTININ 
ISOLECTIN V; CHAIN: A; 

AGGLUTININ ISOLECTIN 


SEQFOLD 
score 




169.98 


CO 
ON 

r- 


50.24 






PMF 
score
PDB annotation


COMPLEMENT INHIBITOR
VCP, SP35; COMPLEMENT,
NMR, MODULES, PROTEIN
STRUCTURE, VACCINIA
VIRUS


MATRIX PROTEIN
EXTRACELLULAR MATRIX,
CALCIUM-BINDING,
GLYCOPROTEIN, 2 REPEAT,
SIGNAL, MULTIGENE
FAMILY, DISEASE
MUTATION, 3 EGF-LIKE
DOMAIN, HUMAN
FIBRILLIN-1 FRAGMENT,
MATRIX PROTEIN


MATRIX PROTEIN 
EXTRACELLULAR MATRIX, 
CALCIUM-BINDING, 
GLYCOPROTEIN, 2 REPEAT, 
SIGNAL, MULTIGENE 
FAMILY, DISEASE 
MUTATION, 3 EGF-LIKE 
DOMAIN, HUMAN 
FIBRILLIN-1 FRAGMENT,
MATRIX PROTEIN


:T. 
11 

dg 

0 & 

38 

« U 


§ "Jill §P 
§33 |£rf&s§S 

to < 8 §3 ££ W < 


CELL ADHESION PROTEIN
EGF-LIKE DOMAIN, CELL


Compound 


COMPLEMENT CONTROL 
PROTEIN; CHAIN: A; 


FIBRILLIN; CHAIN: NULL; 


i 


BLOOD COAGULATION 
FACTOR VIIA; CHAIN: L;
BLOOD COAGULATION
FACTOR VIIA; CHAIN: H;
SOLUBLE TISSUE 
FACTOR; CHAIN: T; 5L15; 
CHAIN: I; 


P-SELECTIN; CHAIN:
NULL;


SEQFOLD 
score 












PMF 
score
PDB annotation


VIRAL PEPTIDE, 2 


COMPLEX (MHC/VIRAL

PEPTIDE/RECEPTOR 


COMPLEX (MHC/VIRAL
PEPTIDE/RECEPTOR) HLA-
A2 HEAVY CHAIN; CLASS I
MHC, T-CELL RECEPTOR,
VIRAL PEPTIDE, 2
COMPLEX (MHC/VIRAL 
PEPTIDE/RECEPTOR 


IMMUNOGLOBULIN 
IMMUNOGLOBULIN, 
ANTIBODY, FAB, ENZYME 
INHIBITOR, PCR, 2 HOT 
START 


IMMUNOGLOBULIN 
IMMUNOGLOBULIN, 
ANTIBODY, FAB, ENZYME 
INHIBITOR, PCR, 2 HOT 
START 


T CELL RECEPTOR TCR; T 
CELL RECEPTOR, MHC 
CLASS I, HUMAN
IMMUNODEFICIENCY
VIRUS, 2 MOLECULAR
RECOGNITION


T CELL RECEPTOR TCR; T
CELL RECEPTOR, MHC
CLASS I, HUMAN
IMMUNODEFICIENCY
VIRUS, 2 MOLECULAR
RECOGNITION


COMPLEX (MHC/VIRAL
PEPTIDE/RECEPTOR) HLA-
A2 HEAVY CHAIN;
COMPLEX (MHC/VIRAL
PEPTIDE/RECEPTOR)


Compound 


RECEPTOR ALPHA; 
CHAIN: D; T CELL
RECEPTOR BETA; CHAIN: 
E; 


HLA-A 0201; CHAIN: A; 
BETA-2 MICROGLOBULIN; 
CHAIN: B; TAX PEPTIDE; 
CHAIN: C; T CELL
RECEPTOR ALPHA; 
CHAIN: D; T CELL 
RECEPTOR BETA; CHAIN: 
E; 


TP7 FAB; CHAIN: L,H; 


TP7 FAB; CHAIN: L,H; 


T CELL RECEPTOR V- 


b 


T CELL RECEPTOR V- 
ALPHA DOMAIN; CHAIN: 
A, B;


HLA-A 0201; CHAIN: A; 
BETA-2 MICROGLOBULIN; 
CHAIN: B; TAX PEPTIDE; 
CHAIN: C; T CELL
RECEPTOR ALPHA;


SEQFOLD 
score 




54.73 




51.86 


67.06 






PMF 
score 






0.30 






0.92 


1.00 


Verify 
score 






0.24
PDB annotation


GP120, T-CELL SURFACE 
GLYCOPROTEIN CD4, 3 
ANTIGEN-BINDING 
FRAGMENT OF HUMAN 
IMMUNOGLOBULIN 17B, 4 
GLYCOSYLATED PROTEIN 


IMMUNOGLOBULIN INTACT
IMMUNOGLOBULIN V
REGION C REGION, 
IMMUNOGLOBULIN 


COMPLEX 

(ANTIBODY/ANTIGEN) 
CYTOKINE RECEPTOR, 
COMPLEX 

(ANTIBODY/ANTIGEN), 2 

TRANSMEMBRANE, 

GLYCOPROTEIN 


COMPLEX 

(IMMUNOGLOBULIN/RECEP 
TOR) TCR VAPLHA VBETA 
DOMAIN; T-CELL 
RECEPTOR, STRAND 
SWITCH, FAB, 

ANTICLONOTYPIC, 2
(IMMUNOGLOBULIN/RECEP
TOR)


S| 1 

w ^ P & w 


E 5 d ^ 
>* fi 

H j?f 6 


Compound 




IGG2A INTACT ANTIBODY 
- MAB231; CHAIN: A, B, C, 
D 


ANTIBODY A6; CHAIN: L, 
H; INTERFERON-GAMMA 
RECEPTOR ALPHA CHAIN; 
CHAIN: I; 


KB5-C20 T-CELL ANTIGEN 
RECEPTOR; CHAIN: A, B; 
ANTIBODY DESIRE-1;
CHAIN: L, H;


KB5-C20 T-CELL ANTIGEN 
RECEPTOR; CHAIN: A, B; 
ANTIBODY DESIRE-1;
CHAIN: L, H;


MHC CLASS I HLA-A; 
CHAIN: A; BETA-2 
MICROGLOBULIN; CHAIN: 
B; TAX PEPTIDE P6A; 


SEQFOLD 
score 






50.28 


62.17 




71.85 


PMF 
score 




0.45 






0.11 




Verify 
score 




0.01 






0.24
PDB annotation


RECEPTOR, IMMUNE 
SYSTEM 


IMMUNE SYSTEM HUMAN 
TCR/PEPTIDE/MHC 
COMPLEX, HLA-A2, HTLV-1, 
TAX, TCR, T 2 CELL 
RECEPTOR, IMMUNE 
SYSTEM 


COMPLEX (COAT 
PROTEIN/IMMUNOGLOBULI 
N) POLYPROTEIN, COAT 
PROTEIN, CORE PROTEIN, 
RNA-DIRECTED RNA 2
POLYMERASE, 
HYDROLASE, THIOL 
PROTEASE, 
MYRISTYLATION, 3 
COMPLEX (COAT 
PROTEIN/IMMUNOGLOBULI 


RECEPTOR TCR; T-CELL, 
RECEPTOR, 
TRANSMEMBRANE, 
GLYCOPROTEIN, SIGNAL 








AMINOPEPTIDASE 
AMINOPEPTIDASE, 
PROLINE IMINOPEPTIDASE, 
SERINE PROTEASE, 2 


Compound 


CHAIN: C; HUMAN T-CELL
RECEPTOR; CHAIN: D; 
HLA-A 0201; CHAIN: E; 


MHC CLASS I HLA-A;
CHAIN: A; BETA-2
MICROGLOBULIN; CHAIN:
B; TAX PEPTIDE P6A;
CHAIN: C; HUMAN T-CELL
RECEPTOR; CHAIN: D; 
HLA-A 0201; CHAIN: E; 


HUMAN RHINOVIRUS 14 
COAT PROTEIN; CHAIN: 1, 
2, 3, 4; FAB 17-IA; CHAIN:
L, H


ALPHA, BETA T-CELL 
RECEPTOR CHAIN: A, B; 


IMMUNOGLOBULIN 
IMMUNOGLOBULIN FAB 
2FB44 


IMMUNOGLOBULIN FAB 
FRAGMENT FROM 


IMMUNOGLOBULIN IGG1
(LAMBDA, HIL) 8FAB 3




PROLINE 

IMINOPEPTIDASE; CHAIN: 
A,B; 


SEQFOLD 
score 






51.58 


55.23 




50.44 






PMF 
score 




1.00 






0.55 






0.30 


Verify 
score




0.38
PDB annotation


PSD-95; PDZ DOMAIN,
NEURONAL NITRIC OXIDE 
SYNTHASE, NMDA 
RECEPTOR 2 BINDING 




COMPLEX (BLOOD
COAGULATION/INHIBITOR)
AUTOPROTHROMBIN IIA;
HYDROLASE, SERINE 
PROTEINASE), PLASMA 
CALCIUM BINDING, 2 
GLYCOPROTEIN, COMPLEX 
(BLOOD 

COAGULATION/INHIBITOR) 


HYDROLASE INHIBITOR 
ALL-BETA STRUCTURE, 
HYDROLASE INHIBITOR 


HYDROLASE INHIBITOR 
ALL-BETA STRUCTURE, 
HYDROLASE INHIBITOR 


PLANT PROTEIN TWO 
HOMOLOGOUS HEVEIN-
LIKE DOMAINS 


PLANT PROTEIN TWO
HOMOLOGOUS HEVEIN-
LIKE DOMAINS


PLANT PROTEIN TWO
HOMOLOGOUS HEVEIN-
LIKE DOMAINS


PLANT PROTEIN TWO
HOMOLOGOUS HEVEIN-
LIKE DOMAINS


PLANT PROTEIN TWO
HOMOLOGOUS HEVEIN-
LIKE DOMAINS


SUGAR BINDING PROTEIN
UDA; LECTIN, HEVEIN
DOMAIN, UDA,
SUPERANTIGEN


Compound 


< 

5 

c 
% 


* 
* 




ACTIVATED PROTEIN C; 
CHAIN: C, L; D-PHE-PRO- 
MAI; CHAIN: P; 


BOWMAN-BIRK TRYPSIN 
INHIBITOR; CHAIN: A 


BOWMAN-BIRK TRYPSIN 
INHIBITOR; CHAIN: A


* 

! 


AGGLUTININ ISOLECTIN 
VI; CHAIN: A


AGGLUTININ ISOLECTIN 
VI; CHAIN: A 


1 

i < 

<> 




AGGLUTININ ISOLECTIN 
VI; CHAIN: A 


AGGLUTININ ISOLECTIN 
VI; CHAIN: A 


AGGLUTININ ISOLECTIN 
VI/AGGLUTININ
ISOLECTIN V; CHAIN: A;


SEQFOLD 
score 
























PMF 
score 






-0.17 


-0.07 


-0.12 


o 

CO 

© 


0.48 


-0.18 


0.35 


0.11 


-0.17 


Verify 
score 






0.15 


oro 


0.84 


0.80 


2.11 


1.23 


0.75 


0.22 


0.98 
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Blast 
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o 
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00 


CO 


VO 


r* 
oo 


START 
AA 






o 


m 
cn 




o 




CO 




o 




CHAIN 
ID 








< 


< 


< 


< 


< 


< 


<: 


< 








1aut


<3 




1ehd


1ehd


1ehd


1ehd


1ehd


.S3 

0) 


SEQ ID
NO: 






OS 
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Os 
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VO 


Os 
VO 


Os 
VO 


OS 

VO 


Os 


Os
PDB annotation


BINDING PROTEIN, 
CYTOKINE, SIGNALLING 
PROTEIN


HORMONE RECEPTOR 
HORMONE RECEPTOR, 
INSULIN RECEPTOR 
FAMILY


GLYCOPROTEIN 
GLYCOPROTEIN 


GLYCOPROTEIN 
GLYCOPROTEIN 


MEMBRANE ADHESION 


SHORT CONSENSUS 
REPEAT, SUSHI, 


COMPLEMENT CONTROL 
PROTEIN, 2 N- 
GLYCOSYLATION, MULTI- 
DOMAIN, MEMBRANE 


ADHESION 


SERINE PROTEASE 
INHIBITOR FACTOR XA 
INHIBITOR; ANTISTASIN, 
CRYSTAL STRUCTURE,
FACTOR XA INHIBITOR, 2
SERINE PROTEASE
INHIBITOR, THROMBOSIS


SERINE PROTEASE
INHIBITOR FACTOR XA
INHIBITOR; ANTISTASIN,
CRYSTAL STRUCTURE,
FACTOR XA INHIBITOR, 2
SERINE PROTEASE
INHIBITOR, THROMBOSIS


SERINE PROTEASE
INHIBITOR FACTOR XA
INHIBITOR; ANTISTASIN,
CRYSTAL STRUCTURE,
FACTOR XA INHIBITOR, 2


Compound 


TUMOR NECROSIS 
FACTOR RECEPTOR; 
CHAIN: A, B; 


INSULIN-LIKE GROWTH 
FACTOR RECEPTOR 1;
CHAIN: A; 


LAMININ; CHAIN: NULL; 


LAMININ; CHAIN: NULL; 


HUMAN BETA2- 
GLYCOPROTEIN I; CHAIN:
A;


■ 


ANTISTASIN; CHAIN:
NULL;


ANTISTASIN; CHAIN:
NULL;




ANTISTASIN; CHAIN:
NULL;




SEQFOLD 
score 


54.40 






r- 






CM 

OO • 




PMF 
score 




-0.19 


-0.03 




-0.20 


-0.17 




-0.05 


Verify 
score 




0.60 


0.84 




0.54 


0.40 




0.02 


Psi 
Blast 


6e-14 


CM 

cm 


1 
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VO 
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i— i 


CM 
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t-H 




START 
AA 
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CO 






cn 


cn 

CM 




CHAIN 
ID


<
1ext


1igr


1klo


1qub

i 


1skz


1skz


1skz


SEQID 
NO: 


ON
PDB annotation


RECEPTOR


CELL ADHESION NCAM; 
NCAM, IMMUNOGLOBULIN 
FOLD, GLYCOPROTEIN 


GROWTH FACTOR/GROWTH 
FACTOR RECEPTOR FGF2; 
FGFR2; IMMUNOGLOBULIN
(IG)LIKE DOMAINS
BELONGING TO THE I-SET 2
SUBGROUP WITHIN IG-LIKE
DOMAINS, B-TREFOIL FOLD 


GROWTH FACTOR/GROWTH 
FACTOR RECEPTOR FGF2; 
FGFR2; IMMUNOGLOBULIN 
(IG)LIKE DOMAINS 
BELONGING TO THE I-SET 2 
SUBGROUP WITHIN IG-LIKE 
DOMAINS, B-TREFOIL FOLD 


IMMUNE SYSTEM FC- 
EPSILON RI-ALPHA;
IMMUNOGLOBULIN FOLD,
GLYCOPROTEIN,
RECEPTOR, IGE-BINDING 2
PROTEIN


IMMUNE SYSTEM HIGH
AFFINITY IGE-FC
RECEPTOR; FC(EPSILON)
IGE-FC; IMMUNOGLOBULIN
FOLD, GLYCOPROTEIN,
RECEPTOR, IGE-BINDING 2
PROTEIN, IGE ANTIBODY,
IGE-FC


IMMUNE SYSTEM,
MEMBRANE PROTEIN CD32;
FC RECEPTOR,
IMMUNOGLOULIN,
LEUKOCYTE, CD32


IMMUNE SYSTEM CD32;
RECEPTOR, FC, CD32,


Compound 




NEURAL CELL ADHESION 
MOLECULE; CHAIN: A, B, 
C,D; 


FIBROBLAST GROWTH 
FACTOR 2; CHAIN: A, B, C, 
D; FIBROBLAST GROWTH 
FACTOR RECEPTOR 2; 
CHAIN: E,F,G,H; 


FIBROBLAST GROWTH 
FACTOR 2; CHAIN: A, B, C, 
D; FIBROBLAST GROWTH 
FACTOR RECEPTOR 2; 
CHAIN: E,F,G,H; 


HIGH AFFINITY 
IMMUNOGLOBULIN 
EPSILON RECEPTOR 
CHAIN: A; 


HIGH AFFINITY 
IMMUNOGLOBULIN 
EPSILON RECEPTOR 
CHAIN: A; IG EPSILON 
CHAIN C REGION; CHAIN: 
B,D; 


FC RECEPTOR 
FC(GAMMA)RIIA; CHAIN: 
A; 


f 


SEQFOLD
score 


















PMF 
score 




0.92 


0.31 


0.65 


0.34 


0.27 


0.29 


0.71 


Verify 
score 




0.34 


0.12 


0.12 


d 


«— * 
9 


0.04 


0.33 


Psi 
Blast 
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NO 1 
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s 

cs 




cn 
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cn 


CN 


START 
AA 




cn 


00 
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a 


cn 


cn 


cn 




CHAIN 
ID 
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< 
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1 
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SEQID 
NO:
PDB annotation
PROTEIN 


COMPLEX (ZINC 
FINGER/DNA) COMPLEX 
(ZINC FINGER/DNA), ZINC 
FINGER, DNA-BINDING 
PROTEIN


COMPLEX (ZINC
FINGER/DNA) COMPLEX
(ZINC FINGER/DNA), ZINC
FINGER, DNA-BINDING
PROTEIN


COMPLEX (ZINC
FINGER/DNA) COMPLEX
(ZINC FINGER/DNA), ZINC
FINGER, DNA-BINDING
PROTEIN


COMPLEX (ZINC
FINGER/DNA) COMPLEX
(ZINC FINGER/DNA), ZINC
FINGER, DNA-BINDING
PROTEIN


COMPLEX (ZINC
FINGER/DNA) COMPLEX
(ZINC FINGER/DNA), ZINC
FINGER, DNA-BINDING
PROTEIN


COMPLEX (ZINC
FINGER/DNA) COMPLEX
(ZINC FINGER/DNA), ZINC
FINGER, DNA-BINDING
PROTEIN


COMPLEX (ZINC


Compound 

BINDING SITE; CHAIN: B, 


C; 

QGSR ZINC FINGER 
PEPTIDE; CHAIN: A; 
DUPLEX 

OLIGONUCLEOTIDE 
BINDING SITE; CHAIN: B, 
C; 


QGSR ZINC FINGER 
PEPTIDE; CHAIN: A; 
DUPLEX 

OLIGONUCLEOTIDE 
BINDING SITE; CHAIN: B, 
C; 


QGSR ZINC FINGER 
PEPTIDE; CHAIN: A; 
DUPLEX 

OLIGONUCLEOTIDE 
BINDING SITE; CHAIN: B, 
C; 


QGSR ZINC FINGER 
PEPTIDE; CHAIN: A; 
DUPLEX 

OLIGONUCLEOTIDE 
BINDING SITE; CHAIN: B, 
C; 


QGSR ZINC FINGER 
PEPTIDE; CHAIN: A; 
DUPLEX 

OLIGONUCLEOTIDE 
BINDING SITE; CHAIN: B, 
C; 


QGSR ZINC FINGER 
PEPTIDE; CHAIN: A; 
DUPLEX 

OLIGONUCLEOTIDE 
BINDING SITE; CHAIN: B, 
C; 


QGSR ZINC FINGER 


SEQ FOLD 
score 
















PMF 
score 


8 
»-< 


-0.06 


0.53 


0.00 


0.19 


0.21 


0.63


Verify 
score 


-0.41 

i 


0.10 


0.02 


-0.37


9 


-0.44 


0.11 


Psi 
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u 

CO 


00 


00 
00 

vo 


ON 
00 


CO 
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8 
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CO 
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3 

00 


o 
oo 
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AA 


o 

00 


?! 


CO 
CO 


1— t 






o\ 

00 


CHAIN 
ID 


< 


< 


< 


< 


< ' 


< 


< 


gB 

Pi 


1a1h


1a1h


1a1h


1a1h


1a1h


1a1h


1a1h


SEQID 
NO:
PDB annotation


FGFR, IMMUNOGLOBULIN- 
LIKE, SIGNAL 
TRANSDUCTION, 2
DIMERIZATION, GROWTH 
FACTOR/GROWTH FACTOR 


3 


GROWTH FACTOR/GROWTH
FACTOR RECEPTOR FGF,
FGFR, IMMUNOGLOBULIN-
LIKE, SIGNAL
TRANSDUCTION, 2
DIMERIZATION, GROWTH
FACTOR/GROWTH FACTOR
RECEPTOR 


CELL ADHESION NCAM; 
NCAM, IMMUNOGLOBULIN 
FOLD, GLYCOPROTEIN 


CELL ADHESION NCAM; 
NCAM, IMMUNOGLOBULIN 
FOLD, GLYCOPROTEIN 


CELL ADHESION NCAM; 
NCAM, IMMUNOGLOBULIN 
FOLD, GLYCOPROTEIN 


CELL ADHESION NCAM;
NCAM, IMMUNOGLOBULIN
FOLD, GLYCOPROTEIN


GROWTH FACTOR/GROWTH
FACTOR RECEPTOR FGF2;
FGFR2; IMMUNOGLOBULIN
(IG)LIKE DOMAINS
BELONGING TO THE I-SET 2
SUBGROUP WITHIN IG-LIKE
DOMAINS, B-TREFOIL FOLD


GROWTH FACTOR/GROWTH
FACTOR RECEPTOR FGF2;
FGFR2; IMMUNOGLOBULIN
(IG)LIKE DOMAINS
BELONGING TO THE I-SET 2
SUBGROUP WITHIN IG-LIKE


Compound 


FIBROBLAST GROWTH 
FACTOR RECEPTOR 1; 
CHAIN: C,D; 


FIBROBLAST GROWTH 
FACTOR 2; CHAIN: A, B; 
FIBROBLAST GROWTH 
FACTOR RECEPTOR 1; 
CHAIN: C, D;


NEURAL CELL ADHESION
MOLECULE; CHAIN: A, B,
C, D;


NEURAL CELL ADHESION
MOLECULE; CHAIN: A, B,
C, D;


NEURAL CELL ADHESION
MOLECULE; CHAIN: A, B,
C, D;


NEURAL CELL ADHESION
MOLECULE; CHAIN: A, B,
C, D;


FIBROBLAST GROWTH 
FACTOR 2; CHAIN: A, B, C, 
D; FIBROBLAST GROWTH 
FACTOR RECEPTOR 2; 
CHAIN: E,F,G,H; 


FIBROBLAST GROWTH 
FACTOR 2; CHAIN: A, B, C, 
D; FIBROBLAST GROWTH 
FACTOR RECEPTOR 2; 
CHAIN: E, F, G, H; 


SEQFOLD 
score 


















PMF 
score 




0.29


0.25 


o 


0.07 


0.18 


0.15 


0.17 


Verify 
score 




o 
o 


0.11 


8 

9 


0.05 
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cs 

9 
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so 
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cs 
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AA 






CO 
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i—i 


s 
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ID 




U 


< 


< 


< 


< 


W 


o 






1cvs
1epf
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? 

4) 
1-1 


SEQID 
NO:
PDB annotation


DOMAINS, B-TREFOIL FOLD


IMMUNE SYSTEM FC- 
EPSILON RI- ALPHA; 
IMMUNOGLOBULIN FOLD, 
GLYCOPROTEIN, 
RECEPTOR, IGE-BINDING 2 
PROTEIN


IMMUNE SYSTEM FC-
EPSILON RI-ALPHA;
IMMUNOGLOBULIN FOLD,
GLYCOPROTEIN,
RECEPTOR, IGE-BINDING 2
PROTEIN


IMMUNE SYSTEM FC-
EPSILON RI-ALPHA; 
IMMUNOGLOBULIN FOLD, 
GLYCOPROTEIN, 
RECEPTOR, IGE-BINDING 2 
PROTEIN 


IMMUNE SYSTEM HIGH 
AFFINITY IGE-FC 
RECEPTOR; FC(EPSILON)
IGE-FC; IMMUNOGLOBULIN
FOLD, GLYCOPROTEIN,
RECEPTOR, IGE-BINDING 2
PROTEIN, IGE ANTIBODY,
IGE-FC 


g § 8 g |j ° 

5 ri fr< 8 O Oft C 


IMMUNE SYSTEM,
MEMBRANE PROTEIN CD32;
FC RECEPTOR,
IMMUNOGLOULIN,


Compound 




HIGH AFFINITY 
IMMUNOGLOBULIN 
EPSILON RECEPTOR 
CHAIN: A; 


HIGH AFFINITY 
IMMUNOGLOBULIN 
EPSILON RECEPTOR 
CHAIN: A; 


HIGH AFFINITY 
IMMUNOGLOBULIN 
EPSILON RECEPTOR 
CHAIN: A; 


HIGH AFFINITY 
IMMUNOGLOBULIN 
EPSILON RECEPTOR 
CHAIN: A; IG EPSILON 
CHAIN C REGION; CHAIN: 
B,D;


HIGH AFFINITY
IMMUNOGLOBULIN 
EPSILON RECEPTOR 
CHAIN: A; IG EPSILON 
CHAIN C REGION; CHAIN: 
B,D; 


FC RECEPTOR 
FC(GAMMA)RIIA; CHAIN: 
A; 


SEQFOLD 
score 
















PMF 
score 




0.71 


0.36 


0.30 


0.72 


0.88 


0.22 


Verify 
score 




0.31 


0.36 


0.30 
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0.25 


0.34
PDB annotation


TRANSCRIPTION 
INITIATION, ZINC FINGER 
PROTEIN 


COMPLEX (TRANSCRIPTION 
REGULATION/DNA) 
COMPLEX (TRANSCRIPTION 
REGULATION/DNA), RNA 
POLYMERASE III, 2
TRANSCRIPTION
INITIATION, ZINC FINGER
PROTEIN


COMPLEX (TRANSCRIPTION 
REGULATION/DNA) YING-
YANG 1; TRANSCRIPTION
INITIATION, INITIATOR
ELEMENT, YY1, ZINC 2
FINGER PROTEIN, DNA- 
PROTEIN RECOGNITION, 3 
COMPLEX (TRANSCRIPTION 
REGULATION/DNA) 


COMPLEX (TRANSCRIPTION 
REGULATION/DNA) YING- 
YANG 1; TRANSCRIPTION 
INITIATION, INITIATOR
ELEMENT, YY1, ZINC 2
FINGER PROTEIN, DNA-


IS 

11 


COMPLEX (TRANSCRIPTION
REGULATION/DNA) YING-
YANG 1; TRANSCRIPTION
INITIATION, INITIATOR
ELEMENT, YY1, ZINC 2
FINGER PROTEIN, DNA-
PROTEIN RECOGNITION, 3
COMPLEX (TRANSCRIPTION
REGULATION/DNA)


f 

1 


Compound 




co g 

Q O 

u i « 


YY1; CHAIN: C; ADENO-
ASSOCIATED VIRUS P5
INITIATOR ELEMENT
DNA; CHAIN: A, B;


YY1; CHAIN: C; ADENO-
ASSOCIATED VIRUS P5
INITIATOR ELEMENT
DNA; CHAIN: A, B;


YY1; CHAIN: C; ADENO-
ASSOCIATED VIRUS P5
INITIATOR ELEMENT
DNA; CHAIN: A, B;


YY1; CHAIN: C; ADENO-


SEQFOLD
score 








90.80 


















PMF 
score 




0.62 


1.00 




0.53 


0.06 


Verify 
score
PDB annotation




MUSCLE PROTEIN MDE;
MUSCLE PROTEIN


MUSCLE PROTEIN MDE; 
MUSCLE PROTEIN 


MUSCLE PROTEIN MUSCLE
PROTEIN


MUSCLE PROTEIN MUSCLE 
PROTEIN 


CONTRACTILE PROTEIN 
MYOSIN MOTOR, 
CONFORMATIONAL 
CHANGES 


CONTRACTILE PROTEIN 
MYOSIN, DICTYOSTELIUM, 
MOTOR, MANT, ATPASE, 
ACTIN-BINDING, 2 COILED 
COIL 


CONTRACTILE PROTEIN
MYOSIN, DICTYOSTELIUM,
MOTOR, MANT, ATPASE,
ACTIN-BINDING, 2 COILED
COIL


CONTRACTILE PROTEIN
ATPASE, MYOSIN, COILED
COIL, ACTIN-BINDING, ATP-
BINDING, 2 HEPTAD
REPEAT PATTERN,
METHYLATION,
ALKYLATION, 3
PHOSPHORYLATION,
CONTRACTILE PROTEIN 


■Ills - 

9 2 2 1 1 § 

H g g <N < ^ 

8 < 8 PP 2 S 


Compound 


MYOSIN ESSENTIAL 
LIGHT CHAIN; CHAIN: Z 


MYOSIN; CHAIN: A, B, C,
D, E, F, G, H;


MYOSIN; CHAIN: A, B, C,
D, E, F, G, H;


MYOSIN; CHAIN: A, B, C,
D, E, F;


MYOSIN; CHAIN: A, B, C,
D, E, F;


MYOSIN HEAD; CHAIN: A; 
MYOSIN HEAD; CHAIN: Y; 
MYOSIN HEAD; CHAIN: Z; 


MYOSIN; CHAIN: NULL; 


MYOSIN; CHAIN: NULL; 




MYOSIN; CHAIN: NULL; 


MYOSIN; CHAIN: NULL; 




SEQFOLD 
score 






526.30 




489.18 




496.50 




425.46 




PMF 
score 




1.00 




1.00 




1.00 




1.00 




1.00 


Verify 
score 




0.61 




0.65 




0.33 




0.48 




0.40 


Psi 
Blast 

PDB annotation 


CELL DIVISION, MITOSIS, 
PHOSPHORYLATION 


TRANSFERASE JNK3; 
TRANSFERASE, JNK3 MAP 
KINASE, 

SERINE/THREONINE 
PROTEIN 2 KINASE 


KINASE KINASE, TWITCHIN, 
INTRASTERIC REGULATION 


KINASE KINASE, TWITCHIN, 
INTRASTERIC REGULATION 


KINASE KINASE, TWITCHIN, 
INTRASTERIC REGULATION 


KINASE KINASE, TWITCHIN, 
INTRASTERIC REGULATION 


TRANSFERASE MITOGEN 
ACTIVATED PROTEIN 
KINASE; TRANSFERASE, 
MAP KINASE, 


SERINE/THREONINE- 
PROTEIN KINASE, 2 P38 


KINASE RABBIT MUSCLE 
PHOSPHORYLASE KINASE; 
GLYCOGEN METABOLISM, 
TRANSFERASE, 
SERINE/THREONINE- 
PROTEIN, 2 KINASE, ATP- 
BINDING, CALMODULIN- 
BINDING 


KINASE RABBIT MUSCLE 
PHOSPHORYLASE KINASE; 
GLYCOGEN METABOLISM, 
TRANSFERASE, 
SERINE/THREONINE- 
PROTEIN, 2 KINASE, ATP- 
BINDING, CALMODULIN- 
BINDING 


SERINE KINASE SERINE 
KINASE, TITIN, MUSCLE, 


Compound 




C-JUN N-TERMINAL 




TWITCHIN; CHAIN: NULL; 




TWITCHIN; CHAIN: A, B; 




MAP KINASE P38; CHAIN: 




1 


1 
2 


PHOSPHORYLASE 
KINASE; CHAIN: NULL; 




< 




SEQFOLD 
score 

139.53 






119.80 




170.32 

1.00 

PDB annotation 


EUKARYOTIC INITIATION 
FACTOR 4A;IF4A, 
HELICASE, DEAD-BOX 
PROTEIN 


CHAPERONE/STRUCTURAL 
PROTEIN CHAPERONE 
ADHESIN DONOR STRAND 
COMPLEMENTATION, 2 
CHAPERONE/STRUCTURAL 
PROTEIN 

CHAPERONE/STRUCTURAL 


PROTEIN CHAPERONE 
ADHESIN DONOR STRAND 
COMPLEMENTATION, 2 
CHAPERONE/STRUCTURAL 
PROTEIN 




HYDROLASE HYDROLASE, 
DEPHOSPHORYLATION 


HYDROLASE PTP1B; 
HYDROLASE, 
PHOSPHORYLATION, 
LIGAND, INHIBITOR 


HYDROLASE C2 DOMAIN, 
PHOSPHOTIDYLINOSITOL, 
PHOSPHOTASE, 
HYDROLASE 


HYDROLASE PROTEIN- 
TYROSINE PHOSPHATASE; 
HYDROLASE, PROTEIN 
TYROSINE PHOSPHATASE, 
CATALYTIC DOMAIN, 2 
WPD LOOP, SH2 DOMAIN 


HYDROLASE DUAL 
SPECIFICITY 
PHOSPHATASE, MAP 
KINASE HYDROLASE 


HYDROLASE DUAL 


Compound 


FACTOR 4A; CHAIN: A, B; 




FIMC; CHAIN: A, C, E, G, I, 
K, M, O; MANNOSE- 
SPECIFIC ADHESIN FIMH; 
CHAIN: B, D, F, H, J, L, N, P; 




PROTEIN TYROSINE 
PHOSPHATASE 1B; 
CHAIN: NULL; 


PROTEIN-TYROSINE 
PHOSPHATASE 1B; 
CHAIN: A; 


PHOSPHOINOSITIDE 
PHOSPHOTASE PTEN; 
CHAIN: A; 


SHP-1; CHAIN: NULL; 


PYST1; CHAIN: NULL; 


PYST1; CHAIN: NULL; 


SEQFOLD 
score 






















PMF 
score 




9 9 
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0.01 
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CM 




START 

1227 
1241 

ID 

1qun 

1mkp 

PDB annotation 


ASSOCIATED 35 KDA 
PROTEIN, P35A, THREE 
HELIX BUNDLE 


ENDOCYTOSIS/EXOCYTOSIS 
SYNAPTOTAGMIN 
ASSOCIATED 35 KDA 
PROTEIN, P35A, THREE 
HELIX BUNDLE 


CONTRACTILE PROTEIN 
TRIPLE-HELIX COILED 
COIL, CONTRACTILE 
PROTEIN 


TRANSCRIPTION 
REGULATION SIGMA70; 
RNA POLYMERASE SIGMA 
FACTOR, TRANSCRIPTION 
REGULATION 








METAL TRANSPORT 
INHIBITOR/RECEPTOR HFE; 
HFE, HEREDITARY 
HEMOCHROMATOSIS, MHC 
CLASS I, TRANSFERRIN 2 
RECEPTOR 


HYDROLASE SGAP; 
DOUBLE-ZINC 
METALLOPROTEINASE, 
CALCIUM ACTIVATION, 
PROTEIN- 2 INHIBITOR 


Compound 




SYNTAXIN-1A; CHAIN: A, 
B,C; 


HUMAN SKELETAL 
MUSCLE ALPHA-ACTININ 
2; CHAIN: A; 


RNA POLYMERASE 
PRIMARY SIGMA 
FACTOR; CHAIN: NULL; 




HYDROLASE (AMINOPEPTIDASE) 
AMINOPEPTIDASE 
(AEROMONAS 
PROTEOLYTICA) 
(E.C.3.4.11.10) 1AMP 3 


HYDROLASE (AMINOPEPTIDASE) 
AMINOPEPTIDASE 
(AEROMONAS 
PROTEOLYTICA) 
(E.C.3.4.11.10) 1AMP 3 


HEMOCHROMATOSIS 
PROTEIN; CHAIN: A, D, G; 
BETA-2-MICROGLOBULIN; 
CHAIN: B, E, H; TRANSFERRIN 
RECEPTOR; CHAIN: C, F, I; 


AMINOPEPTIDASE; 
CHAIN: A; 


SEQ FOLD 
score 




















PMF 
score 




oo 

9 


CN 

9 


0.06 




0.43 


0.43 


0.83 


0.64 


Verify 
score 




0.08 


0.02 






0.04 


0.11 


0.02 


-0.00 


Psi 
Blast 




co 

6 

oo 


VO 

6 

vo 






oo 


5.1e-30 


1e-46 


1.7e-28 






CS 


i— » 
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oo 
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& 






CO 








START 
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CO 


vo 

CO 

r- 
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CN 


CN 


f-4 

y—i 
CN 
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ID 




< 
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»— 1 


1quu 

1amp 


1amp 


»— « 




ID 


























00 


00 


00 
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f-4 










r- I 
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• 
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PDB annotation 


I COMPLEX 1 




STRUCTURAL PROTEIN 
RETINAL S-ANTIGEN, 48 KD 
PROTEIN; VISUAL 
ARRESTIN, 
DESENSITISATION OF THE 
VISUAL TRANSDUCTION 2 
CASCADE, BINDING TO 
ACTICATED AND 
PHOSPHORYLATED 
RHODOPSIN 


STRUCTURAL PROTEIN 
RETINAL S-ANTIGEN, 48 KD 
PROTEIN; VISUAL 
ARRESTIN, 
DESENSITISATION OF THE 
VISUAL TRANSDUCTION 2 
CASCADE, BINDING TO 
ACTICATED AND 
PHOSPHORYLATED 
RHODOPSIN 


STRUCTURAL PROTEIN 
RETINAL S-ANTIGEN, 48 KD 
PROTEIN; VISUAL 
ARRESTIN, 
DESENSITISATION OF THE 
VISUAL TRANSDUCTION 2 
CASCADE, BINDING TO 
ACTICATED AND 
PHOSPHORYLATED 
RHODOPSIN 


PROTEASE PROSOME, 
MULTICATALYTIC 
PROTEASE, MCP, 
MACROPAIN; PROTEASE, 
PROTEASOME, HYDROLASE 


MULTICATALYTIC 


Compound 






ARRESTIN; CHAIN: A, B, C, 
D; 


ARRESTIN; CHAIN: A, B, C, 
D; 


ARRESTIN; CHAIN: A, B, C, 


PROTEASOME; CHAIN: A, 
B, C, D, E, F, G, H, I, J, K, L, 
M, N, O, P, Q, 


20S PROTEASOME; 


SEQFOLD 
score 






73.18 




71.95 




71.75 


55.61 




PMF 
score 








0.00 










Verify 
score 








cn 

9 










Psi 
Blast 






1.2e-41 


»-» 


CO 
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i 

CM 


1 








OO 

m 








o 

CM 




START 
AA 








3. 


l-H 




CM 
•— « 




s 


e 








< 


0 






w 


la 






1cf1 


1cf1 


1cf1 



1pma 


s 

1— 1 


SEQID 
NO: 
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CM 


CM 
—1 


CM 
~H 




CM 
CM 


CM 
CM 
*— < 
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PDB annotation 


PROTEINASE 
MULTICATALYTIC 
PROTEINASE, 20S 
PROTEASOME, PROTEIN 2 
DEGRADATION, ANTIGEN 
PROCESSING, HYDROLASE, 
PROTEASE 


j 


PROTEINASE 
MULTICATALYTIC 
PROTEINASE, 20S 
PROTEASOME, PROTEIN 2 
DEGRADATION, ANTIGEN 
PROCESSING, HYDROLASE, 
PROTEASE 


MULTICATALYTIC 


MULTICATALYTIC 
PROTEINASE, 20S 
PROTEASOME, PROTEIN 2 
DEGRADATION, ANTIGEN 
PROCESSING, HYDROLASE, 
PROTEASE 


MULTICATALYTIC 
PROTEINASE 
MULTICATALYTIC 
PROTEINASE, 20S 
PROTEASOME, PROTEIN 2 
DEGRADATION, ANTIGEN 
PROCESSING, HYDROLASE, 
PROTEASE 




RIBOSOMAL PROTEIN I, 
RIBOSOMAL PROTEIN, 
RRNA-BINDING 




Compound 




20S PROTEASOME; 




20S PROTEASOME; 
CHAIN: A, B, C, D, E, F, G, 
H, I, J, K, L, M, N, O, P, Q, 




RIBOSOMAL PROTEIN L9; 




RIBOSOMAL PROTEIN L9; 
CHAIN: NULL; 




SEQFOLD 
score 




84.34 


58.38 


52.75 






54.38 

PDB annotation 


HALOPEROXIDASE 


BROMOPEROXIDASE L, 
HALOPEROXIDASE L; 
HALOPEROXIDASE, 
OXIDOREDUCTASE 


HALOPEROXIDASE 
CHLOROPEROXIDASE A1, 
HALOPEROXIDASE A1; 
HALOPEROXIDASE, 
OXIDOREDUCTASE 


HALOPEROXIDASE 
HALOPEROXIDASE F; 
HALOPEROXIDASE, 
OXIDOREDUCTASE, 
PROPIONATE COMPLEX 


AMINOPEPTIDASE 
AMINOPEPTIDASE, 
PROLINE IMINOPEPTIDASE, 
SERINE PROTEASE, 2 
XANTHOMONAS 
CAMPESTRIS 


HYDROLASE HYDROLASE, 
HALOALKANE 
DEHALOGENASE 


HALOPEROXIDASE 
CHLOROPEROXIDASE A2; 
HALOPEROXIDASE, 
OXIDOREDUCTASE, 
PEROXIDASE, ALPHA/BETA 
2 HYDROLASE FOLD, 
MUTANT M99T 


HYDROLASE BPHD; 
HYDROLASE, PCB 
DEGRADATION 


HYDROLASE A/B 
HYDROLASE FOLD, 
DEHALOGENASE I-S BOND 


Compound 


CHLOROPEROXIDASE L; 
CHAIN: A, B, C; 


BROMOPEROXIDASE A1; 
CHAIN: NULL; 






PROLINE 

IMINOPEPTIDASE; CHAIN: 
A,B; 


HALOALKANE 


NULL; 


HALOALKANE 
DEHALOGENASE; 1- 
CHLOROHEXANE CHAIN: 

68.50 


63.30 

PDB annotation 




PROTEIN KINASE CDK2; 
PROTEIN KINASE, CELL 
CYCLE, 
PHOSPHORYLATION, 
STAUROSPORINE, 2 CELL 
DIVISION, MITOSIS, 
INHIBITION 


COMPLEX 
(KINASE/INHIBITOR) CDK6; 
P19INK4D; CYCLIN 
DEPENDENT KINASE, 
CYCLIN DEPENDENT 
KINASE INHIBITORY 2 
PROTEIN, CDK, INK4, CELL 
CYCLE, COMPLEX 
(KINASE/INHIBITOR) 
HEADER HELIX 


COMPLEX (INHIBITOR 
PROTEIN/KINASE) 
INHIBITOR PROTEIN, 
CYCLIN-DEPENDENT 
KINASE, CELL CYCLE 2 
CONTROL, ALPHA/BETA, 
COMPLEX (INHIBITOR 
PROTEIN/KINASE) 






Compound 


MEGA-8 1APM 6 


CYCLIN-DEPENDENT 
PROTEIN KINASE 2; 


CHAIN: NULL; 




CYCLIN-DEPENDENT 
KINASE 6; CHAIN: A; 
P19INK4D; CHAIN: B; 


PHOSPHOTRANSFERASE 
CAMP-DEPENDENT 
PROTEIN KINASE 
CATALYTIC SUBUNIT 
1CMK 3 (E.C.2.7.1.37) 
1CMK 4 


TRANSFERASE (PHOSPHOTRANSFERASE) 
CAMP-DEPENDENT PROTEIN 
KINASE (E.C.2.7.1.37) 
(CAPK) 1CTP 3 
(CATALYTIC SUBUNIT) 
1CTP 4 


SEQFOLD 
score 




160.34 


148.04 


163.82 


116.16 


112.97 

SEQ ID 
NO: 

PDB annotation 


SH3, 2 PHOSPHOTYROSINE 

PROTO-ONCOGENE, 

PHOSPHOTRANSFERASE 


TYROSINE KINASE 
TYROSINE KINASE- 
INHIBITOR COMPLEX, 
DOWN-REGULATED 
KINASE, 2 ORDERED 
ACTIVATION LOOP 




COMPLEX (ZINC 
FINGER/DNA) ZINC FINGER, 
PROTEIN-DNA 
INTERACTION, PROTEIN 
DESIGN, 2 CRYSTAL 
STRUCTURE, COMPLEX 
(ZINC FINGER/DNA) 




ENDONUCLEASE 
ENDONUCLEASE, TRNA 
ENDONUCLEASE 




KINASE KINASE, SIGNAL 

TRANSDUCTION, 

CALCIUM/CALMODULIN 






Compound 




HAEMATOPOETIC CELL 
KINASE (HCK); CHAIN: A; 




DNA; CHAIN: A, B, D, E; 
CONSENSUS ZINC FINGER 
PROTEIN; CHAIN: C, F, G; 




TRNA ENDONUCLEASE; 
CHAIN: A, B, C, D; 




CALCIUM/CALMODULIN- 
DEPENDENT PROTEIN 
KINASE; CHAIN: NULL; 


TRANSFERASE (PHOSPHOTRANSFERASE) 
CAMP-DEPENDENT PROTEIN 
KINASE (E.C.2.7.1.37) 
(CAPK) 1APM 3 
(CATALYTIC SUBUNIT) 
ALPHA ISOENZYME 
MUTANT WITH SER 139 
1APM 4 REPLACED BY 
ALA (S139A) COMPLEX 
WITH THE PEPTIDE 1APM 
5 INHIBITOR PKI(5-24) 
AND THE DETERGENT 
MEGA-8 1APM 6 


SEQ FOLD 
score 




74.45 

119.44 

PDB annotation 


PHOSPHORYLATION 


ACTIN-BINDING PROTEIN 
ACTIN-BINDING PROTEIN, 
CALCIUM-BINDING, 
PHOSPHORYLATION 


ACTIN-BINDING CALPONIN 
HOMOLOGY (CH) DOMAIN; 
FILAMENTOUS ACTIN- 
BINDING DOMAIN, 
CYTOSKELETON 


STRUCTURAL PROTEIN 
DYSTROPHIN, MUSCULAR 
DYSTROPHY, CALPONIN 
HOMOLOGY DOMAIN, 2 
ACTIN-BINDING, UTROPHIN 


STRUCTURAL PROTEIN 
CALPONIN HOMOLOGY 
DOMAIN, DOMAIN 
SWAPPING, ACTIN 
BINDING, 2 UTROPHIN, 
DYSTROPHIN, 
STRUCTURAL PROTEIN 


EXTRACELLULAR MODULE 
OSTEONECTIN, SPARC, 
SECRETED PROTEIN ACIDIC 
AND EXTRACELLULAR 
MODULE, GLYCOPROTEIN, 
ANTI-ADHESIVE PROTEIN, 2 
COLLAGEN BINDING, SITE- 
DIRECTED MUTAGENESIS, 
GLYCOSYLATED 3 PROTEIN 
MODRES 




MOLECULAR CHAPERONE 
HDJ-1; MOLECULAR 


Compound 






SPECTRIN BETA CHAIN; 


CHAIN: A; 


DYSTROPHIN; CHAIN: A, 
B,C,D; 


UTROPHIN ACTIN 
BINDING REGION; CHAIN: 
A,B; 




BASEMENT MEMBRANE 
PROTEIN BM-40; CHAIN: 
A,B; 




DNAJ; CHAIN: NULL; 


HUMAN HSP40; CHAIN: 
NULL; 


SEQFOLD 
score 

score 

PDB annotation 


TURN-HELIX, DNA 2 
BINDING PROTEIN 




COMPLEX (ZINC 
FINGER/DNA) COMPLEX 
(ZINC FINGER/DNA), ZINC 
FINGER, DNA-BINDING 
PROTEIN 


COMPLEX (ZINC 
FINGER/DNA) COMPLEX 
(ZINC FINGER/DNA), ZINC 
FINGER, DNA-BINDING 
PROTEIN 


COMPLEX (ZINC 
FINGER/DNA) COMPLEX 
(ZINC FINGER/DNA), ZINC 
FINGER, DNA-BINDING 
PROTEIN 


COMPLEX (ZINC 
FINGER/DNA) COMPLEX 
(ZINC FINGER/DNA), ZINC 
FINGER, DNA-BINDING 
PROTEIN 






Compound 






QGSR ZINC FINGER 
PEPTIDE; CHAIN: A; 
DUPLEX 
OLIGONUCLEOTIDE 
BINDING SITE; CHAIN: B, 
C; 


QGSR ZINC FINGER 
PEPTIDE; CHAIN: A; 
DUPLEX 
OLIGONUCLEOTIDE 
BINDING SITE; CHAIN: B, 
C; 


QGSR ZINC FINGER 
PEPTIDE; CHAIN: A; 
DUPLEX 
OLIGONUCLEOTIDE 
BINDING SITE; CHAIN: B, 
C; 


QGSR ZINC FINGER 
PEPTIDE; CHAIN: A; 
DUPLEX 
OLIGONUCLEOTIDE 
BINDING SITE; CHAIN: B, 
C; 


TRANSCRIPTION 
REGULATION YEAST 
TRANSCRIPTION FACTOR 
ADR1 (RESIDUES 102 - 130) 
1ARD 3 (AMINO 
TERMINAL ZINC FINGER 
DOMAIN) (NMR, 10 
STRUCTURES) 1ARD 4 
(ADR1B) 1ARD 5 


TRANSCRIPTION 
REGULATION YEAST 
TRANSCRIPTION FACTOR 

PDB annotation 


REGULATION/DNA) TFIIIA; 
5S GENE; NMR, TFIIIA, 
PROTEIN, DNA, 
TRANSCRIPTION FACTOR, 
5S RNA 2 GENE, DNA 
BINDING PROTEIN, ZINC 
FINGER, COMPLEX 3 
(TRANSCRIPTION 
REGULATION/DNA) 


COMPLEX (TRANSCRIPTION 
REGULATION/DNA) TFIIIA; 
5S GENE; NMR, TFIIIA, 
PROTEIN, DNA, 
TRANSCRIPTION FACTOR, 
5S RNA 2 GENE, DNA 
BINDING PROTEIN, ZINC 
FINGER, COMPLEX 3 
(TRANSCRIPTION 
REGULATION/DNA) 


COMPLEX (TRANSCRIPTION 
REGULATION/DNA) 
COMPLEX (TRANSCRIPTION 
REGULATION/DNA), RNA 
POLYMERASE III, 2 
TRANSCRIPTION 
INITIATION, ZINC FINGER 
PROTEIN 


COMPLEX (TRANSCRIPTION 
REGULATION/DNA) 
COMPLEX (TRANSCRIPTION 
REGULATION/DNA), RNA 
POLYMERASE III, 2 
TRANSCRIPTION 
INITIATION, ZINC FINGER 
PROTEIN 


COMPLEX (TRANSCRIPTION 
REGULATION/DNA) YING- 
YANG 1; TRANSCRIPTION 
INITIATION, INITIATOR 


Compound 




TRANSCRIPTION FACTOR 
IIIA; CHAIN: A; 5S RNA 
GENE; CHAIN: E, F; 


TFIIIA; CHAIN: A, D; 5S 
RIBOSOMAL RNA GENE; 
CHAIN: B, C, E, F; 


TFIIIA; CHAIN: A, D; 5S 
RIBOSOMAL RNA GENE; 
CHAIN: B, C, E, F; 

YY1; CHAIN: C; ADENO- 
ASSOCIATED VIRUS P5 
INITIATOR ELEMENT 
DNA; CHAIN: A, B; 


SEQFOLD 
score 

PDB annotation 


STRUCTURE 




LIPID BINDING PROTEIN 
APO-E3; LIPID TRANSPORT, 
LIPID TRANSPORT, 
HEPARIN-BINDING, 
PLASMA 2 PROTEIN, HDL, 
VLDL REMARK 


COMPLEX (HSP24/HSP70) 
HSP70, GRPE, MOLECULAR 
CHAPERONE, NUCLEOTIDE 
EXCHANGE 2 FACTOR, 
COILED-COIL, COMPLEX 
(HSP24/HSP70) 


ENDOCYTOSIS/EXOCYTOSIS 
NSEC1; PROTEIN-PROTEIN 
COMPLEX, MULTI-SUBUNIT 


ENDOCYTOSIS/EXOCYTOSIS 
SYNAPTOTAGMIN 
ASSOCIATED 35 KDA 
PROTEIN, P35A, THREE 
HELIX BUNDLE 


ZINC-BINDING PROTEIN 
ZINC-BINDING PROTEIN, 
XNF7, BBOX, 
DEVELOPMENT, 3 MID- 
BLASTULA-TRANSITION 


CONTRACTILE PROTEIN 
TRIPLE-HELIX COILED 
COIL, CONTRACTILE 
PROTEIN 


CONTRACTILE PROTEIN 
TRIPLE-HELIX COILED 
COIL, CONTRACTILE 
PROTEIN 


LIPID TRANSPORT APO A-I; 
LIPOPROTEIN, LIPID 
TRANSPORT, 

Compound 






APOLIPOPROTEIN E; 
CHAIN: A; 


NUCLEOTIDE EXCHANGE 
FACTOR GRPE; CHAIN: A, 
B; MOLECULAR 
CHAPERONE DNAK; 
CHAIN: D; 


SYNTAXIN BINDING 
PROTEIN 1; CHAIN: A; 
SYNTAXIN 1A; CHAIN: B; 


SYNTAXIN-1A; CHAIN: A, 
B, C; 


NUCLEAR FACTOR XNF7; 
CHAIN: NULL; 


HUMAN SKELETAL 
MUSCLE ALPHA-ACTININ 
2; CHAIN: A; 


HUMAN SKELETAL 
MUSCLE ALPHA-ACTININ 
2; CHAIN: A; 




APOLIPOPROTEIN A-I; 
CHAIN: A, B, C, D; 


SEQFOLD 
score 


















59.40 




64.70 


PMF 

0.18 


0.00 

0.51 








Verify 

PDB annotation 


PROTEIN-DNA 
INTERACTION, PROTEIN 
DESIGN, 2 CRYSTAL 
STRUCTURE, COMPLEX 
(ZINC FINGER/DNA) 


COMPLEX (ZINC 
FINGER/DNA) ZINC FINGER, 
PROTEIN-DNA 
INTERACTION, PROTEIN 
DESIGN, 2 CRYSTAL 
STRUCTURE, COMPLEX 
(ZINC FINGER/DNA) 


COMPLEX (ZINC 
FINGER/DNA) ZINC FINGER, 
PROTEIN-DNA 
INTERACTION, PROTEIN 
DESIGN, 2 CRYSTAL 
STRUCTURE, COMPLEX 
(ZINC FINGER/DNA) 


COMPLEX (ZINC 
FINGER/DNA) ZINC FINGER, 
PROTEIN-DNA 
INTERACTION, PROTEIN 
DESIGN, 2 CRYSTAL 
STRUCTURE, COMPLEX 
(ZINC FINGER/DNA) 


COMPLEX (ZINC 
FINGER/DNA) ZINC FINGER, 
PROTEIN-DNA 
INTERACTION, PROTEIN 
DESIGN, 2 CRYSTAL 
STRUCTURE, COMPLEX 
(ZINC FINGER/DNA) 


COMPLEX (ZINC 
FINGER/DNA) ZINC FINGER, 
PROTEIN-DNA 
INTERACTION, PROTEIN 
DESIGN, 2 CRYSTAL 
STRUCTURE, COMPLEX 


Compound 


PROTEIN; CHAIN: C, F, G; 


DNA; CHAIN: A, B, D, E; 
CONSENSUS ZINC FINGER 
PROTEIN; CHAIN: C, F, G; 


DNA; CHAIN: A, B, D, E; 
CONSENSUS ZINC FINGER 
PROTEIN; CHAIN: C, F, G; 


DNA; CHAIN: A, B, D, E; 
CONSENSUS ZINC FINGER 
PROTEIN; CHAIN: C, F, G; 


DNA; CHAIN: A, B, D, E; 
CONSENSUS ZINC FINGER 
PROTEIN; CHAIN: C, F, G; 


DNA; CHAIN: A, B, D, E; 
CONSENSUS ZINC FINGER 
PROTEIN; CHAIN: C, F, G; 


SEQFOLD 
score 














PMF 
score 

1.00 


1.00 


1.00 


1.00 


Verify 

PDB annotation 


COMPLEX (ZINC 
FINGER/DNA) ZINC FINGER, 
PROTEIN-DNA 
INTERACTION, PROTEIN 
DESIGN, 2 CRYSTAL 
STRUCTURE, COMPLEX 
(ZINC FINGER/DNA) 


COMPLEX (TRANSCRIPTION 
REGULATION/DNA) TFIIIA; 
5S GENE; NMR, TFIIIA, 
PROTEIN, DNA, 
TRANSCRIPTION FACTOR, 
5S RNA 2 GENE, DNA 
BINDING PROTEIN, ZINC 
FINGER, COMPLEX 3 
(TRANSCRIPTION 
REGULATION/DNA) 


COMPLEX (TRANSCRIPTION 
REGULATION/DNA) TFIIIA; 
5S GENE; NMR, TFIIIA, 
PROTEIN, DNA, 
TRANSCRIPTION FACTOR, 
5S RNA 2 GENE, DNA 
BINDING PROTEIN, ZINC 
FINGER, COMPLEX 3 
(TRANSCRIPTION 
REGULATION/DNA) 


COMPLEX (TRANSCRIPTION 
REGULATION/DNA) 
COMPLEX (TRANSCRIPTION 
REGULATION/DNA), RNA 
POLYMERASE III, 2 
TRANSCRIPTION 
INITIATION, ZINC FINGER 
PROTEIN 


COMPLEX (TRANSCRIPTION 
REGULATION/DNA) 
COMPLEX (TRANSCRIPTION 
REGULATION/DNA), RNA 


Compound 


DNA; CHAIN: A, B, D, E; 
CONSENSUS ZINC FINGER 
PROTEIN; CHAIN: C, F, G; 


TRANSCRIPTION FACTOR 
IIIA; CHAIN: A; 5S RNA 
GENE; CHAIN: E, F; 


TRANSCRIPTION FACTOR 
IIIA; CHAIN: A; 5S RNA 
GENE; CHAIN: E, F; 


TFIIIA; CHAIN: A, D; 5S 
RIBOSOMAL RNA GENE; 
CHAIN: B, C, E, F; 


SEQFOLD 
score 

PDB annotation 




COMPLEX (ZINC 
FINGER/DNA) ZINC FINGER, 
PROTEIN-DNA 
INTERACTION, PROTEIN 
DESIGN, 2 CRYSTAL 
STRUCTURE, COMPLEX 
(ZINC FINGER/DNA) 


COMPLEX (ZINC 
FINGER/DNA) ZINC FINGER, 
PROTEIN-DNA 
INTERACTION, PROTEIN 
DESIGN, 2 CRYSTAL 
STRUCTURE, COMPLEX 
(ZINC FINGER/DNA) 


COMPLEX (ZINC 
FINGER/DNA) ZINC FINGER, 
PROTEIN-DNA 
INTERACTION, PROTEIN 
DESIGN, 2 CRYSTAL 
STRUCTURE, COMPLEX 
(ZINC FINGER/DNA) 


COMPLEX (ZINC 
FINGER/DNA) ZINC FINGER, 
PROTEIN-DNA 
INTERACTION, PROTEIN 
DESIGN, 2 CRYSTAL 
STRUCTURE, COMPLEX 
(ZINC FINGER/DNA) 


COMPLEX (ZINC 
FINGER/DNA) ZINC FINGER, 
PROTEIN-DNA 
INTERACTION, PROTEIN 
DESIGN, 2 CRYSTAL 
STRUCTURE, COMPLEX 
(ZINC FINGER/DNA) 


Compound 




DNA; CHAIN: A, B, D, E; 
CONSENSUS ZINC FINGER 
PROTEIN; CHAIN: C, F, G; 


DNA; CHAIN: A, B, D, E; 
CONSENSUS ZINC FINGER 
PROTEIN; CHAIN: C, F, G; 


DNA; CHAIN: A, B, D, E; 
CONSENSUS ZINC FINGER 
PROTEIN; CHAIN: C, F, G; 


DNA; CHAIN: A, B, D, E; 
CONSENSUS ZINC FINGER 
PROTEIN; CHAIN: C, F, G; 


DNA; CHAIN: A, B, D, E; 
CONSENSUS ZINC FINGER 
PROTEIN; CHAIN: C, F, G; 




MUTANT 
1BB0 3R 
ABU (Cll 
STRUCT! 


SEQFOLD 
score 

score 

0.95 


1.00 

1.00 

score 

PDB annotation 


(TRANSCRIPTION 
REGULATION/DNA) 


COMPLEX (TRANSCRIPTION 
REGULATION/DNA) TFIIIA; 
5S GENE; NMR, TFIIIA, 
PROTEIN, DNA, 
TRANSCRIPTION FACTOR, 
5S RNA 2 GENE, DNA 
BINDING PROTEIN, ZINC 
FINGER, COMPLEX 3 
(TRANSCRIPTION 
REGULATION/DNA) 


COMPLEX (TRANSCRIPTION 
REGULATION/DNA) 
COMPLEX (TRANSCRIPTION 
REGULATION/DNA), RNA 
POLYMERASE III, 2 
TRANSCRIPTION 
INITIATION, ZINC FINGER 
PROTEIN 


COMPLEX (TRANSCRIPTION 
REGULATION/DNA) 
COMPLEX (TRANSCRIPTION 
REGULATION/DNA), RNA 
POLYMERASE III, 2 
TRANSCRIPTION 
INITIATION, ZINC FINGER 
PROTEIN 


COMPLEX (TRANSCRIPTION 
REGULATION/DNA) 
COMPLEX (TRANSCRIPTION 
REGULATION/DNA), RNA 
POLYMERASE III, 2 
TRANSCRIPTION 
INITIATION, ZINC FINGER 
PROTEIN 


COMPLEX (TRANSCRIPTION 
REGULATION/DNA) YING- 
YANG 1; TRANSCRIPTION 


Compound 




TRANSCRIPTION FACTOR 
IIIA; CHAIN: A; 5S RNA 
GENE; CHAIN: E, F; 


TFIIIA; CHAIN: A, D; 5S 
RIBOSOMAL RNA GENE; 
CHAIN: B, C, E, F; 




CHAIN: B,C,E,F; 




RIBOSOMAL RNA GENE; 
CHAIN: B,C,E,F; 


YY1; CHAIN: C; ADENO- 
ASSOCIATED VIRUS P5 
INITIATOR ELEMENT 


SEQ FOLD 
score 








115.39 

0.03 


1.00 




1.00 


1.00 


Verify 
score 

PDB annotation 


DESIGN, 2 CRYSTAL 
STRUCTURE, COMPLEX 
(ZINC FINGER/DNA) 


COMPLEX (ZINC 
FINGER/DNA) ZINC FINGER, 
PROTEIN-DNA 
INTERACTION, PROTEIN 
DESIGN, 2 CRYSTAL 
STRUCTURE, COMPLEX 
(ZINC FINGER/DNA) 


COMPLEX (ZINC 
FINGER/DNA) ZINC FINGER, 
PROTEIN-DNA 
INTERACTION, PROTEIN 
DESIGN, 2 CRYSTAL 
STRUCTURE, COMPLEX 
(ZINC FINGER/DNA) 


COMPLEX (ZINC 
FINGER/DNA) ZINC FINGER, 
PROTEIN-DNA 
INTERACTION, PROTEIN 
DESIGN, 2 CRYSTAL 


COMPLEX (ZINC 
FINGER/DNA) ZINC FINGER, 
PROTEIN-DNA 
INTERACTION, PROTEIN 
DESIGN, 2 CRYSTAL 
STRUCTURE, COMPLEX 
(ZINC FINGER/DNA) 


COMPLEX (ZINC 
FINGER/DNA) ZINC FINGER, 
PROTEIN-DNA 
INTERACTION, PROTEIN 
DESIGN, 2 CRYSTAL 
STRUCTURE, COMPLEX 
(ZINC FINGER/DNA) 


8 


Compound 




DNA; CHAIN: A, B, D, E; 
CONSENSUS ZINC FINGER 
PROTEIN; CHAIN: C, F, G; 


DNA; CHAIN: A, B, D, E; 
CONSENSUS ZINC FINGER 
PROTEIN; CHAIN: C, F, G; 


DNA; CHAIN: A, B, D, E; 
CONSENSUS ZINC FINGER 
PROTEIN; CHAIN: C, F, G; 


DNA; CHAIN: A, B, D, E; 
CONSENSUS ZINC FINGER 
PROTEIN; CHAIN: C, F, G; 


DNA; CHAIN: A, B, D, E; 
CONSENSUS ZINC FINGER 
PROTEIN; CHAIN: C, F, G; 



SEQFOLD 
score 
















PMF 
score 




8 

»— < 


o 
o 


o 
o 


8 

1-4 


s. 




Verify 
score 




»— ( 

d 


o 


S3 

d 


1 


§ 


CM 

d 


Psi 
Blast 




1 


! 

w-> 

00 


o 


•s 

wo 


CO 


3 

oo 

vd 






CO 

C- 
en 


T— 1 




§ 


1. 


CO 
*-H 

wo 


START 
AA 




CM 
ON 
CM 


CO 


oo 

a 


CO 


i 


CM 


CHAIN 
ID 






a 




U 




o 




U 




u 


U 






1mey 


1mey 


1mey 


1mey 


1mey 


SEQID 
NO: 




00 
CM 


00 
CM 




oo 
CM 


oo 

CM 





383 



WO 02/081731 



PCT/US02/01222 






PDB annotation 


INITIATION, ZINC FINGER 
PROTEIN 


COMPLEX (TRANSCRIPTION 
REGULATION/DNA) 
COMPLEX (TRANSCRIPTION 
REGULATION/DNA), RNA 
POLYMERASE III, 2 
TRANSCRIPTION 
INITIATION, ZINC FINGER 
PROTEIN 


COMPLEX (TRANSCRIPTION 
REGULATION/DNA) 
COMPLEX (TRANSCRIPTION 
REGULATION/DNA), RNA 
POLYMERASE III, 2 
TRANSCRIPTION 
INITIATION, ZINC FINGER 
PROTEIN 


COMPLEX (TRANSCRIPTION 
REGULATION/DNA) 
COMPLEX (TRANSCRIPTION 
REGULATION/DNA), RNA 
POLYMERASE III, 2 
TRANSCRIPTION 
INITIATION, ZINC FINGER 
PROTEIN 


COMPLEX (TRANSCRIPTION 
REGULATION/DNA) 
COMPLEX (TRANSCRIPTION 
REGULATION/DNA), RNA 
POLYMERASE III, 2 
TRANSCRIPTION 
INITIATION, ZINC FINGER 
PROTEIN 


COMPLEX (TRANSCRIPTION 
REGULATION/DNA) 
COMPLEX (TRANSCRIPTION 
REGULATION/DNA), RNA 
POLYMERASE III, 2 


Compound 




TFIIIA; CHAIN: A, D; 5S 
RIBOSOMAL RNA GENE; 
CHAIN: B, C, E, F; 


TFIIIA; CHAIN: A, D; 5S 
RIBOSOMAL RNA GENE; 
CHAIN: B, C, E, F; 


TFIIIA; CHAIN: A, D; 5S 
RIBOSOMAL RNA GENE; 
CHAIN: B, C, E, F; 


TFIIIA; CHAIN: A, D; 5S 
RIBOSOMAL RNA GENE; 
CHAIN: B, C, E, F; 




SEQ FOLD 
score 














PMF 
score 




0.96 


1.00 


0.90 


0.57 


0.43 


Verify 
score 




S 


o 
9 


0.03 


-0.35 


-0.22 


Psi 
Blast 




CO 


r** 

00 


00 
CO 


1e-33 


i— • 

i-H 






cs 

00 
CO 






oo 


CO 


START 
AA 




r-* 

CO 
CN 










CHAIN 
ID 




< 


< 


< 


< 


< 






»— i 


1— 1 


t-H 


1tf6 


1tf6 


SEQID 
NO: 




r** 

00 
<N 


3 












PDB annotation 


SHORT-CHAIN 
DEHYDROGENASE 


OXIDOREDUCTASE 
OXIDOREDUCTASE, 
TROPANE ALKALOID 
BIOSYNTHESIS, 
REDUCTION OF 2 
TROPINONE TO TROPINE, 
SHORT-CHAIN 
DEHYDROGENASE 


OXIDOREDUCTASE 
OXIDOREDUCTASE, 
TROPANE ALKALOID 
BIOSYNTHESIS, 
REDUCTION OF 2 
TROPINONE TO TROPINE, 
SHORT-CHAIN 
DEHYDROGENASE 


OXIDOREDUCTASE NAD- 
DEPENDENT 




OXIDOREDUCTASE NAD- 
DEPENDENT 
OXIDOREDUCTASE, SHORT- 
CHAIN ALCOHOL 2 
DEHYDROGENASE, PCB 
DEGRADATION 




OXIDOREDUCTASE SHORT- 
CHAIN DEHYDROGENASE, 
OXIDOREDUCTASE 


Compound 




TROPINONE REDUCTASE- 
I; CHAIN: A, B; 


TROPINONE REDUCTASE- 
I; CHAIN: A,B; 




CHAIN: NULL; 


CARBONYL REDUCTASE; 
CHAIN: A, B, C, D; 


CARBONYL REDUCTASE; 
CHAIN: A, B, C, D; 


ENOYL-ACYL CARRIER 
PROTEIN (ACP) 
REDUCTASE; 1ENY 4 


SEQFOLD 
score 






89.11 


78.30 






84.06 


51.90 




PMF 
score 




1.00 






1.00 


1.00 






Verify 
score 




0.16 






9 

O 


0.17 






Psi 
Blast 




1.7e-65 


in 
vo 


oo 


1.7e-48 


6 
*>. 

T— I 


CM 

vo 

IN 
< 


ov 

1-4 

6 
oo 

vd 


fi < 




VO 
OO 
CM 


OV 


oo 
o 

CO 


Si 

oo 
cs 


oo 
cm 


oo 

CM 


CO 

On 

CM 


START 
AA 




CN 
CO 




CO 
CO 


CO 


CO 
CO 


CO 
CO 


CO 




» 




PQ 


« 






< 


< 








1ae1 


1ae1 


1bdb 


1bdb 


1cyd 


1cyd 


1eny 


SEQID 
NO: 




Ov 
<N 




ON 
CM 




CM 


Ov 
CM 


Ov 
CM 

PDB annotation 


UA1L><JKEL> UL 1 AOH 

SEPIAPTERIN REDUCTASE, 

TETRAHYDROBIOPTERIN, 

OXIDOREDUCTASE 


OXIDOREDUCTASE 
NAPHTHOL REDUCTASE; 
OXIDOREDUCTASE 


OXIDOREDUCTASE 
NAPHTHOL REDUCTASE; 


OXIDOREDUCTASE 
OXIDOREDUCTASE, 
TROPANE ALKALOID 
BIOSYNTHESIS, 
REDUCTION OF 2 
TROPINONE TO 
PSEUDOTROPINE, SHORT- 
CHAIN DEHYDROGENASE 


OXIDOREDUCTASE 
OXIDOREDUCTASE, 
TROPANE ALKALOID 
BIOSYNTHESIS, 
REDUCTION OF 2 
TROPINONE TO 
PSEUDOTROPINE, SHORT- 
CHAIN DEHYDROGENASE 


1 




CELL ADHESION PROTEIN 
A-DOMAIN INTEGRIN, CELL 
ADHESION PROTEIN, 
GLYCOPROTEIN, 

ThYTR APRT J ITT AP 9. mi 


Si 

| 


Compound 


REDUCTASE; CHAIN: 
NULL; 


TRIHYDROXYNAPHTHALENE 
REDUCTASE; CHAIN: 
A, B; 


TRIHYDROXYNAPHTHALENE 
REDUCTASE; CHAIN: 
A, B; 


TROPINONE REDUCTASE- 
II; CHAIN: A, B; 


TROPINONE REDUCTASE- 
II; CHAIN: A, B; 


OXIDOREDUCTASE 
(FLAVOENZYME) 
GLUTATHIONE 
REDUCTASE (E.C.1.6.4.2), 
OXIDIZED FORM (E) 3GRS 4 




INTEGRIN; CHAIN: NULL; 




SEQFOLD 
score 




78.32 




95.13 










PMF 
score 






1.00 




1.00 


0.07 




0.95 


Verify 
score 


N 




0.58 




0.25 


-0.75 




0.16 


Psi 
Blast 


N 

§ 

-4 


© 

<£> 
ft 


! 

1—4 


CO 


4 

CO 


0.0085 




1 

i-4 


Mi 




s 

rM 




o 

CO 


vo 

3 






vo 

CO 

•—i 


START 
AA 


0 

o 


m 

M 


rf 
co 


1— < 
CO 




r-i 




CN 


CHAIN 
ID 




< 


< 


<: 


< 










1 


> 
— i 


1ybv 


2ae2 


t 


§> 

CO 




1ido 


SEQID 
NO: 


t 

in < 
M < 




ON 
CN 


ON 
CM 


ON 

CM 


Tt 
ON 
CM 




On 
CM 

PDB annotation 


CELL ADHESION LFA-1, 
ALPHA-L\,BETA-2 
INTEGRIN, A-DOMAIN; 1LFA 8 




COMPLEX (TRANSCRIPTION 
FACTOR/DNA) 
TRANSCRIPTION FACTOR, 
PROTEIN-DNA COMPLEX, 
CYTOKINE 2 ACTIVATION, 
COMPLEX (TRANSCRIPTION 
FACTOR/DNA) 




DNA-BINDING HMGA DNA- 
BINDING HMG-BOX 
DOMAIN A OF RAT HMG1; 
1AAB 8 HMG-BOX 1AAB 20 


DNA-BINDING HMGA DNA- 
BINDING HMG-BOX 
DOMAIN A OF RAT HMG1; 
1AAB 8 HMG-BOX 1AAB 20 


DNA BINDING PROTEIN 
HMG BOX, DNA BENDING, 
DNA RECOGNITION, 
CHROMATIN, NMR, DNA 2 
BINDING PROTEIN 


DNA BINDING PROTEIN 
HMG BOX, DNA BENDING, 
DNA RECOGNITION, 
CHROMATIN, NMR, DNA 2 
BINDING PROTEIN 




Compound 


CD11A; 1LFA 5 CHAIN: A, 
B; 1LFA 6 




STAT3B; CHAIN: A; 18-MER 
DESOXYOLIGONUCLEOTIDE; 
CHAIN: B; 




HIGH MOBILITY GROUP 
PROTEIN; 1AAB 5 CHAIN: 
NULL; 1AAB 6 


HIGH MOBILITY GROUP 
PROTEIN; 1AAB 5 CHAIN: 
NULL; 1AAB 6 


NON HISTONE PROTEIN 6 
A; CHAIN: A; 


NON HISTONE PROTEIN 6 
A; CHAIN: A; 


HIGH MOBILITY GROUP 1 
PROTEIN; CHAIN: A; DNA 
(5'-D(*CP*CP*(IDO) 
CHAIN: B; DNA (5'- CHAIN: 
C; 


SEQ FOLD 
score 














54.43 






PMF 
score 


0.07 




0.01 




1.00 


0.99 




1.00 

i 


0.99 


Verify 
score 


0.10 




0.02 




0.32 


0.67 




0.52 


0.63 


Psi 
Blast 


\o 

i 




CO 




i—l 

6 

00 

XT 


? 

0) 

«n 

00 


6.8e-15 


r-t 
CO 

vd 


0.00014 








ON 
CO 
CO 




i 


00 
CN 


CO 


ON 
ON 
CN 


00 


START 
AA 


CN 




CO 




ON 
CO 
CN 


ON 
XT 
CN 






CN 


CHAIN 
ID 


< 




< 








< . 




< 




1lfa 




JO 




1aab 


1aab 


1cg7 


1cg7 


1ckt 


SEQ ID 
NO: 


m 

ON 
CN 

1 


o 
o 

CO 

PDB annotation 


— i^r 




COMPLEX (DNA-BINDING 
PROTEIN/DNA) 


1PCT 


Mil 




SgSgg 

galss 

1 g « 1 g 


Compound 


DNA-BINDING HIGH
MOBILITY GROUP
PROTEIN FRAGMENT-B
(HMGB) (DNA-BINDING
1HME 3 HMG-BOX
DOMAIN B OF RAT HMG1)
(NMR, 1 STRUCTURE)
1HME 4


DNA-BINDING HIGH
MOBILITY GROUP
PROTEIN FRAGMENT-B
(HMGB) (DNA-BINDING
1HME 3 HMG-BOX
DOMAIN B OF RAT HMG1)
(NMR, 1 STRUCTURE)
1HME 4


HUMAN SRY; 1HRY 6
CHAIN: A; 1HRY 7 DNA;
1HRY 9 CHAIN: B; 1HRY 10


DNA-BINDING HIGH
MOBILITY GROUP
PROTEIN 1 (HMG1) BOX 2,
COMPLEXED WITH 1HSM
3 MERCAPTOETHANOL
(NMR, MINIMIZED
AVERAGE STRUCTURE)
1HSM 4


# 

■•WW- 




FIBROBLAST GROWTH 
FACTOR 2; CHAIN: A, B, C, 
D; FIBROBLAST GROWTH 


N 

p 

ill 


SEQFOLD 


I 
















rv. 
S 

E 


2 
i 


00 
00 

d 


d 


NO 
»— 1 

d 


d 


d 




i— i 
d 


Verify 




co 
d 


vo 

CO 

d 


00 
O 

9 


o 
d 


i-H 

d 




8 
9 


Psi 
Blast 


m 
»— i 

CN 


le-09 


! 

»—5 


o 

i 


1.7e-05 




I 


s < 


CN 

o\ 
cn 


s 


o\ 

C\ 
CN 


ON 
« 

CO 


00 
ON 
CN 




ON 
On 

NO 


START 
AA 


o 

a 


8 






s 




CO 

so 


CHAIN 
ID 






< 




<: 




O 


pa ^ 

g9 


1 

i-H 


1 




S 

4= 
?— t 






S> 


SEQID 
NO: 


o 
o 
co 


o 
o 

CO 


8 

co 


o 
o 

CO 


8 

CO 




»— 1 

o 

CO 
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PDB annotation 


SUBGROUP WITHIN IG-LIKE 
DOMAINS, B-TREFOIL FOLD 






LIGASE CBL, UBCH7, ZAP-
70, E2, UBIQUITIN, E3,
PHOSPHORYLATION, 2
TYROSINE KINASE,
UBIQUITINATION, PROTEIN
DEGRADATION,


METAL BINDING PROTEIN
RING FINGER PROTEIN
MAT1; RING FINGER
(C3HC4)


DNA-BINDING PROTEIN
V(D)J RECOMBINATION
ACTIVATING PROTEIN 1;
RAG1, V(D)J
RECOMBINATION,
ANTIBODY, MAD, RING
FINGER, 2 ZINC BINUCLEAR
CLUSTER, ZINC FINGER,
DNA-BINDING PROTEIN


DNA INTEGRATION DNA
INTEGRATION, AIDS,
POLYPROTEIN,
HYDROLASES
ENDONUCLEASE,
POLYNUCLEOTIDYL
TRANSFERASE, DNA
BINDING 3 (VIRAL)


DNA INTEGRATION DNA
INTEGRATION, AIDS,
POLYPROTEIN,


Compound 






VIRUS EQUINE HERPES
VIRUS-1 (C3HC4, OR RING
DOMAIN) 1CHC 3 (NMR, 1
STRUCTURE) 1CHC 4


SIGNAL TRANSDUCTION
PROTEIN CBL; CHAIN: A;
ZAP-70 PEPTIDE; CHAIN:
B; UBIQUITIN-
CONJUGATING ENZYME
E12-18KDA UBCH7;
CHAIN: C;


CDK-ACTIVATING
KINASE ASSEMBLY
FACTOR MAT1; CHAIN: A;


RAG1; CHAIN: NULL;




HIV-1 INTEGRASE; CHAIN:
NULL;


INTEGRASE; CHAIN: A, B, 
C; 


q 
























SEQFOI 
score 
























PMF 
score 






CO 
00 

© 


0.52 


0.27 


0.65 




0.22 


0.06 


Verify 
score 






0.02 


-0.34 


0.30 


8 

d 




-0.90 


-0.63 


Psi 
Blast 
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*— i 
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h 
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6 
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oo 

NO 


le-13 
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oo 




ON 
OO 
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ON 
CO 






















P; 


<n 

CO 


ON 
CO 












CHAIN 
ID 








< 


< 








O 


ga 






1chc


1fbv




1rmd




1bhl


CO 

*3 

1— » 


SEQID 
NO: 
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PDB annotation 


METHYLATION, 
ALKYLATION, 3 
PHOSPHORYLATION, 
CONTRACTILE PROTEIN 


MUSCLE PROTEIN MUSCLE
PROTEIN, MYOSIN
SUBFRAGMENT-1, MYOSIN
HEAD, 2 MOTOR PROTEIN


MUSCLE PROTEIN MUSCLE 
PROTEIN, MYOSIN 
SUBFRAGMENT-1, MYOSIN 
HEAD, 2 MOTOR PROTEIN 




COMPLEX (ZINC 
FINGER/DNA) COMPLEX 
(ZINC FINGER/DNA), ZINC 
FINGER, DNA-BINDING 
PROTEIN 


COMPLEX (ZINC 
FINGER/DNA) COMPLEX 
(ZINC FINGER/DNA), ZINC 
FINGER, DNA-BINDING 
PROTEIN 


COMPLEX (ZINC
FINGER/DNA) COMPLEX
(ZINC FINGER/DNA), ZINC
FINGER, DNA-BINDING
PROTEIN


COMPLEX (ZINC
FINGER/DNA) COMPLEX


Compound 




MYOSIN; CHAIN: A, B, C; 


MYOSIN; CHAIN: A, B, C; 




QGSR ZINC FINGER
PEPTIDE; CHAIN: A;
DUPLEX
OLIGONUCLEOTIDE
BINDING SITE; CHAIN: B,
C;


QGSR ZINC FINGER
PEPTIDE; CHAIN: A;
DUPLEX
OLIGONUCLEOTIDE
BINDING SITE; CHAIN: B,
C;


QGSR ZINC FINGER
PEPTIDE; CHAIN: A;
DUPLEX
OLIGONUCLEOTIDE
BINDING SITE; CHAIN: B,
C;


QGSR ZINC FINGER 




QGSR ZINC FINGER 
PEPTIDE; CHAIN: A; 


SEQFOLD 
score 






319.94 




87.27 










PMF 
score 




1.00 








0.29 


0.86 


0.25 


0.25 


Verify 
score 




0.09 








3 


0.46 


0.05 


0.29 
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i-H 
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<: 
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PDB annotation 


EXCHANGE FACTOR 
GUANINE NUCLEOTIDE 
EXCHANGE FACTOR, G- 
PROTEIN, TRANSLATION 2 
ELONGATION 


GUANINE NUCLEOTIDE
EXCHANGE FACTOR
GUANINE NUCLEOTIDE
EXCHANGE FACTOR, G-
PROTEIN, TRANSLATION 2
ELONGATION


SIGNAL PROTEIN NUCLEAR
MATRIX TARGETING
SIGNAL PROTEIN


ANKYRIN BINDING MAB;
ANKYRIN BINDING,
ATPASE, GLUTATHIONE S-
TRANSFERASE, CARRIER 2
CRYSTALLIZATION, ION
TRANSPORT


TRANSFERASE, BLOOD
CLOTTING GAMMA CHAIN
INTEGRIN FRAGMENT,
CARRIER PROTEIN DRIVEN
2 CRYSTALLIZATION


TRANSFERASE GST;
TRANSFERASE,
DETOXIFICATION,
GLUTATHIONE
TRANSFERASE




Compound 


BETA; CHAIN: NULL; 


ELONGATION FACTOR 1- 
BETA; CHAIN: NULL; 


AML-1B; CHAIN: A; 


FUSION PROTEIN OF 
ALPHA-NA,K-ATPASE 
WITH CHAIN: NULL; 


CHIMERA OF
GLUTATHIONE S-
TRANSFERASE-
SYNTHETIC CHAIN: A, B;


ELONGATION FACTOR
EEF1A; CHAIN: A;
ELONGATION FACTOR
EEF1BA; CHAIN: B;


GLUTATHIONE
TRANSFERASE; CHAIN:
NULL;


GLUTATHIONE
TRANSFERASE
GLUTATHIONE S-
TRANSFERASE
(E.C.2.5.1.18) FUSED WITH
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score 
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PDB annotation 


(ZINC FINGER/DNA), ZINC 
FINGER, DNA-BINDING 
PROTEIN 


COMPLEX (ZINC
FINGER/DNA) COMPLEX
(ZINC FINGER/DNA), ZINC
FINGER, DNA-BINDING
PROTEIN


COMPLEX (ZINC
FINGER/DNA) COMPLEX
(ZINC FINGER/DNA), ZINC
FINGER, DNA-BINDING
PROTEIN


COMPLEX (ZINC
FINGER/DNA) COMPLEX
(ZINC FINGER/DNA), ZINC
FINGER, DNA-BINDING
PROTEIN


COMPLEX (ZINC
FINGER/DNA) COMPLEX
(ZINC FINGER/DNA), ZINC
FINGER, DNA-BINDING
PROTEIN


COMPLEX (ZINC
FINGER/DNA) COMPLEX
(ZINC FINGER/DNA), ZINC
FINGER, DNA-BINDING
PROTEIN




PQ 




« 






PQ 


PQ 


Compound 


DUPLEX 

OLIGONUCLEOTIDE 
BINDING SITE; CHAIN
C; 


QGSR ZINC FINGER 
PEPTIDE; CHAIN: A; 
DUPLEX 

OLIGONUCLEOTIDE 
BINDING SITE; CHAIN 
C; 


QGSR ZINC FINGER 
PEPTIDE; CHAIN: A; 
DUPLEX 

OLIGONUCLEOTIDE 
BINDING SITE; CHAIN 

C; 


QGSR ZINC FINGER 
PEPTIDE; CHAIN: A; 
DUPLEX 

OLIGONUCLEOTIDE 
BINDING SITE; CHAIN 
C; 


QGSR ZINC FINGER 
PEPTIDE; CHAIN: A; 
DUPLEX 

OLIGONUCLEOTIDE 
BINDING SITE; CHAIN 
C; 


QGSR ZINC FINGER 
PEPTIDE; CHAIN: A; 
DUPLEX 

OLIGONUCLEOTIDE 
BINDING SITE; CHAIN
C; 


QGSR ZINC FINGER 
PEPTIDE; CHAIN: A; 
DUPLEX 

OLIGONUCLEOTIDE 
BINDING SITE; CHAIN 
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PMF 
. score 




CO 

9* 


0.17 


CO 

2 


0.74 


1.00 


1.00 


Verify 
score 




0.03 


-0.26 


8 
9 


s 

9 


d 


0.28 


Psi 
Blast 




o 

CO 


? 


5 

V) 
00 


5 
4 

00 


3 


CO 






o\ 

c5 




s 

CO 


a 

CO 




CO 


START 
AA 




? 






vo 

a 


On 

CS 


s 

CO 


CHAIN 
ID 




< 


< 


< 


< 


< 


< 






1a1h


1a1h


1a1h


1a1h


1a1h


1a1h

i 
i 


SEQID 
NO: 




CO 

»— i 

CO 


CO 
CO 


CO 
1— « 
CO 


CO 
CO 


CO 

»-H 

CO 


CO 
1—1 

CO 



WO 02/081731

PCT/US02/01222







PDB annotation 


I PROTEIN I 


COMPLEX (TRANSCRIPTION
REGULATION/DNA)
COMPLEX (TRANSCRIPTION
REGULATION/DNA), RNA
POLYMERASE III, 2
TRANSCRIPTION
INITIATION, ZINC FINGER
PROTEIN


COMPLEX (TRANSCRIPTION
REGULATION/DNA)
COMPLEX (TRANSCRIPTION
REGULATION/DNA), RNA
POLYMERASE III, 2
TRANSCRIPTION
INITIATION, ZINC FINGER
PROTEIN


REGULATION/DNA)
COMPLEX (TRANSCRIPTION
REGULATION/DNA), RNA
POLYMERASE III, 2
TRANSCRIPTION
INITIATION, ZINC FINGER
PROTEIN


REGULATION/DNA) YING-
YANG 1; TRANSCRIPTION
INITIATION, INITIATOR
ELEMENT, YY1, ZINC 2
FINGER PROTEIN, DNA-
PROTEIN RECOGNITION, 3
COMPLEX (TRANSCRIPTION
REGULATION/DNA)


COMPLEX (TRANSCRIPTION
REGULATION/DNA) YING-
YANG 1; TRANSCRIPTION
INITIATION, INITIATOR
ELEMENT, YY1, ZINC 2


Compound 




TFIIIA; CHAIN: A, D; 5S
RIBOSOMAL RNA GENE;
CHAIN: B, C, E, F;


TFIIIA; CHAIN: A, D; 5S
RIBOSOMAL RNA GENE;
CHAIN: B, C, E, F;


YY1; CHAIN: C; ADENO-
ASSOCIATED VIRUS P5
INITIATOR ELEMENT
DNA; CHAIN: A, B;


YY1; CHAIN: C; ADENO-
ASSOCIATED VIRUS P5
INITIATOR ELEMENT


< 
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PDB annotation 


BINDING PROTEIN/DNA)


COMPLEX (DNA-BINDING
PROTEIN/DNA) FIVE-
FINGER GLI; GLI, ZINC
FINGER, COMPLEX (DNA-
BINDING PROTEIN/DNA)


COMPLEX (DNA-BINDING
PROTEIN/DNA) FIVE-
FINGER GLI; GLI, ZINC
FINGER, COMPLEX (DNA-
BINDING PROTEIN/DNA)


COMPLEX (DNA-BINDING
PROTEIN/DNA) FIVE-
FINGER GLI; GLI, ZINC
FINGER, COMPLEX (DNA-
BINDING PROTEIN/DNA)


COMPLEX (DNA-BINDING
PROTEIN/DNA) FIVE-
FINGER GLI; GLI, ZINC
FINGER, COMPLEX (DNA-
BINDING PROTEIN/DNA)


COMPLEX (DNA-BINDING
PROTEIN/DNA) FIVE-
FINGER GLI; GLI, ZINC
FINGER, COMPLEX (DNA-
BINDING PROTEIN/DNA)






BLOOD COAGULATION
BLOOD COAGULATION,
EGF, HYDROLASE, SERINE
PROTEASE


SURFACE PROTEIN
MEROZOITE SURFACE


! 
I 




ZINC FINGER PROTEIN
GLI1; CHAIN: A; DNA;
CHAIN: C, D;


ZINC FINGER PROTEIN
GLI1; CHAIN: A; DNA;
CHAIN: C, D;


ZINC FINGER PROTEIN
GLI1; CHAIN: A; DNA;
CHAIN: C, D;


ZINC FINGER PROTEIN
GLI1; CHAIN: A; DNA;
CHAIN: C, D;


ZINC FINGER PROTEIN
GLI1; CHAIN: A; DNA;
CHAIN: C, D;




COAGULATION FACTOR
EGF-LIKE MODULE OF
BLOOD COAGULATION
FACTOR X (N-TERMINAL,
1APO 3 APO FORM) (NMR,
13 STRUCTURES) 1APO 4


FACTOR VII; CHAIN:
NULL;


MEROZOITE SURFACE
PROTEIN 1; CHAIN: A;
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NK CELL NK CELL, 
RECEPTOR, C-TYPE LECTIN, 
C-TYPE LECTIN-LIKE, NKD 


NK CELL NK CELL, 
RECEPTOR, C-TYPE LECTIN, 
C-TYPE LECTIN-LIKE, NKD 






PDB annotation 


GLYCOPROTEIN
GLYCOPROTEIN,
HYDROLASE, SERINE
PROTEASE, PLASMA,
BLOOD 2 COAGULATION
FACTOR


GROWTH FACTOR TGF-
ALPHA, H-TGF-ALPHA; EGF-
LIKE DOMAIN STRUCTURE,
GROWTH FACTOR




SUGAR BINDING PROTEIN
C-TYPE LECTIN, CRD, SP-D,
COLECTIN, ALPHA-
HELICAL COILED- 2 COIL,
LUNG SURFACTANT,
SUGAR BINDING PROTEIN


COLLAGEN BINDING
PROTEIN IX-BP; IX-BP;
COAGULATION FACTOR IX-
BINDING, HETERODIMER,
VENOM, HABU 2 SNAKE, C-
TYPE LECTIN
SUPERFAMILY, COLLAGEN
BINDING PROTEIN


COLLAGEN BINDING
PROTEIN IX-BP; IX-BP;
COAGULATION FACTOR IX-
BINDING, HETERODIMER,
VENOM, HABU 2 SNAKE, C-
TYPE LECTIN
SUPERFAMILY, COLLAGEN
BINDING PROTEIN


Compound 


COAGULATION FACTOR 
X; CHAIN: NULL; 


TRANSFORMING 
GROWTH FACTOR 
ALPHA; CHAIN: NULL; 




LUNG SURFACTANT
PROTEIN D; CHAIN: A, B,
C;


CD94; CHAIN: NULL;


COAGULATION FACTOR
IX-BINDING PROTEIN A;
CHAIN: A; COAGULATION
FACTOR IX-BINDING
PROTEIN B; CHAIN: B;


COAGULATION FACTOR
IX-BINDING PROTEIN A;
CHAIN: A; COAGULATION
FACTOR IX-BINDING
PROTEIN B; CHAIN: B;


b 
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PDB annotation 


MEMBRANE PROTEIN C-
TYPE LECTIN-LIKE
DOMAINS


MEMBRANE PROTEIN C-
TYPE LECTIN-LIKE
DOMAINS


HEMATOPOIETIC CELL 
RECEPTOR ACTIVATION 


EA 1, HEMATOPOIETIC 
CELL RECEPTOR, 
LEUCOCYTE, C-TYPE 
LECTIN-LIKE, 2 NKD, KLR 


SUGAR BINDING PROTEIN 
C-TYPE LECTIN, MANNOSE 
RECEPTOR 


COAGULATION FACTOR
BINDING IX/X-BP
COAGULATION FACTOR
BINDING, C-TYPE LECTIN,
GLA-DOMAIN 2 BINDING, C-
TYPE CRD MOTIF, LOOP
EXCHANGED DIMER


COAGULATION FACTOR
BINDING IX/X-BP
COAGULATION FACTOR
BINDING, C-TYPE LECTIN,
GLA-DOMAIN 2 BINDING, C-
TYPE CRD MOTIF, LOOP
EXCHANGED DIMER


COAGULATION FACTOR
BINDING IX/X-BP
COAGULATION FACTOR
BINDING, C-TYPE LECTIN,
GLA-DOMAIN 2 BINDING, C-
TYPE CRD MOTIF, LOOP
EXCHANGED DIMER


Compound 


FLAVOCETIN-A: ALPHA
SUBUNIT; CHAIN: A;
FLAVOCETIN-A: BETA
SUBUNIT; CHAIN: B


FLAVOCETIN-A: ALPHA
SUBUNIT; CHAIN: A;
FLAVOCETIN-A: BETA
SUBUNIT; CHAIN: B


EARLY ACTIVATION
ANTIGEN CD69; CHAIN: A;


MACROPHAGE MANNOSE
RECEPTOR; CHAIN: A, B;


COAGULATION FACTORS
IX/X-BINDING PROTEIN;
CHAIN: A, B, C, D, E, F;


COAGULATION FACTORS
IX/X-BINDING PROTEIN;
CHAIN: A, B, C, D, E, F;


COAGULATION FACTORS
IX/X-BINDING PROTEIN;
CHAIN: A, B, C, D, E, F;


SEQFOLD 
score 
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PDB annotation 


FACTOR/GROWTH FACTOR 
RECEPTOR 


GROWTH FACTOR/GROWTH
FACTOR RECEPTOR FGF,
FGFR, IMMUNOGLOBULIN-
LIKE, SIGNAL
TRANSDUCTION, 2
DIMERIZATION, GROWTH
FACTOR/GROWTH FACTOR
RECEPTOR


GROWTH FACTOR/GROWTH
FACTOR RECEPTOR FGF,
FGFR, IMMUNOGLOBULIN-
LIKE, SIGNAL
TRANSDUCTION, 2
DIMERIZATION, GROWTH
FACTOR/GROWTH FACTOR
RECEPTOR


GROWTH FACTOR/GROWTH
FACTOR RECEPTOR FGF1;
FGFR1; IMMUNOGLOBULIN
(IG) LIKE DOMAINS
BELONGING TO THE I-SET 2
SUBGROUP WITHIN IG-LIKE
DOMAINS, B-TREFOIL FOLD


IMMUNE SYSTEM,
MEMBRANE PROTEIN CD32; 3
FC RECEPTOR,
IMMUNOGLOBULIN,
LEUKOCYTE, CD32


CONTRACTILE PROTEIN
IMMUNOGLOBULIN FOLD,
BETA BARREL


MUSCLE PROTEIN
CONNECTIN, NEXTM5;
CELL ADHESION,
GLYCOPROTEIN,
TRANSMEMBRANE,
REPEAT, BRAIN, 2


Compound 




FIBROBLAST GROWTH 
FACTOR 2; CHAIN: A, B; 
FIBROBLAST GROWTH 
FACTOR RECEPTOR 1; 
CHAIN: C,D; 


FIBROBLAST GROWTH 
FACTOR 2; CHAIN: A, B; 
FIBROBLAST GROWTH 
FACTOR RECEPTOR 1; 
CHAIN: C,D; 


FIBROBLAST GROWTH 
FACTOR 1; CHAIN: A, B; 
FIBROBLAST GROWTH 
FACTOR RECEPTOR 1; 
CHAIN: C,D; 


FC RECEPTOR
FC(GAMMA)RIIA; CHAIN:
A;


TELOKIN; CHAIN: A 


TITIN; CHAIN: NULL; 


SEQFOLD 
score 
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PDB annotation 


IMMUNOGLOBULIN FOLD,
ALTERNATIVE SPLICING,
SIGNAL, 3 MUSCLE
PROTEIN




GLYCOPROTEIN CD4;
IMMUNOGLOBULIN FOLD,
TRANSMEMBRANE,
GLYCOPROTEIN, T-CELL, 2
MHC LIPOPROTEIN,
POLYMORPHISM


MUSCLE PROTEIN
IMMUNOGLOBULIN
SUPERFAMILY, I SET,
MUSCLE PROTEIN


IMMUNE SYSTEM P58
NATURAL KILLER CELL
RECEPTOR; KIR, NATURAL
KILLER RECEPTOR,
INHIBITORY RECEPTOR, 2
IMMUNOGLOBULIN


IMMUNE SYSTEM P58
NATURAL KILLER CELL
RECEPTOR; KIR, NATURAL
KILLER RECEPTOR,
INHIBITORY RECEPTOR, 2
IMMUNOGLOBULIN


IMMUNE SYSTEM CD32;
RECEPTOR, FC, CD32,
IMMUNE SYSTEM


CELL ADHESION PROTEIN
NCAM MODULE 2; CELL
ADHESION,
GLYCOPROTEIN, HEPARIN-


Compound 




1 


MODULE M5
(CONNECTIN) 1TNM 3
(NMR, MINIMIZED
AVERAGE STRUCTURE)
1TNM 4 1TNM 58


T-CELL SURFACE
GLYCOPROTEIN CD4;
CHAIN: A, B;




MHC CLASS I NK CELL
RECEPTOR PRECURSOR;
CHAIN: A;


MHC CLASS I NK CELL
RECEPTOR PRECURSOR;
CHAIN: A;


FC GAMMA RIIB; CHAIN:
A;


NEURAL CELL ADHESION
MOLECULE, LARGE
ISOFORM; CHAIN: A;


SEQ FOLD | 
score 


















P- 
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0.24 
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0.46 




Blast 




i—i 
i— < 

CO 
CO 


o\ 
i—i 

6 
o\ 
a\ 


i— ( 

<*> 

VO 
VD 


o 

6 

CO 

i-5 


CS 
CS 

6 

NO 
NO 


5 

CO 
CO 


i—i 

4 

vo' 






1—1 
CS 




CM 


o 

CM 
i—i 


1 


OS 


§ 


S3 


< 
< 






CO 
CO 




i— • 






CS 


1 CHAIN 1 


a 






< 




0 


< 


< 


< 






1tnm


1wio


1wit


2dli 
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PDB annotation 


COMPLEX (NUCLEAR
PROTEIN/RNA) COMPLEX
(NUCLEAR PROTEIN/RNA),
RNA,
SNRNP, RIBONUCLEOPROTEIN


COMPLEX (NUCLEAR
PROTEIN/RNA) COMPLEX
(NUCLEAR PROTEIN/RNA),
RNA,
SNRNP, RIBONUCLEOPROTEIN


COMPLEX (NUCLEAR
PROTEIN/RNA) COMPLEX
(NUCLEAR PROTEIN/RNA),
RNA,
SNRNP, RIBONUCLEOPROTEIN


COMPLEX (NUCLEAR
PROTEIN/RNA) COMPLEX
(NUCLEAR PROTEIN/RNA),
RNA,
SNRNP, RIBONUCLEOPROTEIN


COMPLEX (NUCLEAR
PROTEIN/RNA) COMPLEX
(NUCLEAR PROTEIN/RNA),
RNA,
SNRNP, RIBONUCLEOPROTEIN


GROWTH FACTOR/GROWTH
FACTOR RECEPTOR FGF,
FGFR, IMMUNOGLOBULIN-
LIKE, SIGNAL
TRANSDUCTION, 2
DIMERIZATION, GROWTH
FACTOR/GROWTH FACTOR
RECEPTOR


r 

1 

o 

1 

6 


Compound 


U2 RNA HAIRPIN IV;
CHAIN: Q, R; U2A';
CHAIN: A, C; U2B";
CHAIN: B, D;


U2 RNA HAIRPIN IV;
CHAIN: Q, R; U2A';
CHAIN: A, C; U2B";
CHAIN: B, D;


U2 RNA HAIRPIN IV;
CHAIN: Q, R; U2A';
CHAIN: A, C; U2B";
CHAIN: B, D;


U2 RNA HAIRPIN IV;
CHAIN: Q, R; U2A';
CHAIN: A, C; U2B";
CHAIN: B, D;


U2 RNA HAIRPIN IV;
CHAIN: Q, R; U2A';
CHAIN: A, C; U2B";
CHAIN: B, D;


FIBROBLAST GROWTH 
FACTOR 2; CHAIN: A, B; 
FIBROBLAST GROWTH 
FACTOR RECEPTOR 1; 
CHAIN: C,D; 


I FIBROBLAST GROWTH 


SEQFOLD 
score 
















PMF 
score 


0.16 


0.51 


0.90 


-0.02 


0.78 


0.24 


0.18 


Verify 
score 


i -0.20 
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9 
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CO 
00 






S 

CO 




ON 
rH 


ON 

vn 


START 
AA 


o 




vo 

00 


i-H 
«— 1 
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ID 


< 
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< 


U 
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a 
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Icvs 
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SUBUNTT; GAMMAl, 
TRANSDUCIN GAMMA
SUBUNIT; COMPLEX (GTP-
BINDING/TRANSDUCER), G
PROTEIN, HETEROTRIMER 2
SIGNAL TRANSDUCTION










PDB annotation 




COMPLEX 

(KINASE/INHIBITOR) CDK6; 
P19INK4D; CYCLIN 
DEPENDENT KINASE, 
CYCLIN DEPENDENT 
KINASE INHIBITORY 2 
PROTEIN, CDK, INK4, CELL 
CYCLE, COMPLEX 
(KINASE/INHIBITOR) 
HEADER HELIX 


COMPLEX (INHIBITOR 
PROTEIN/KINASE) 
INHIBITOR PROTEIN, 
CYCLIN-DEPENDENT 
KINASE, CELL CYCLE 2 
CONTROL, ALPHA/BETA, 
COMPLEX (INHIBITOR 
PROTEIN/KINASE) 


PHOSPHOTRANSFERASE
PROTEIN KINASE 1CKI 18




OXYGEN TRANSPORT 
OXYGEN TRANSPORT, 
HEME, RESPIRATORY 
PROTEIN, ERYTHROCYTE 


OXYGEN TRANSPORT 
OXYGEN TRANSPORT, 
HEME, RESPIRATORY 
PROTEIN, ERYTHROCYTE 


OXYGEN TRANSPORT 
OXYGEN TRANSPORT, 
HEME, RESPIRATORY 


Compound 


GAMMA; CHAIN: G; 




CYCLIN-DEPENDENT 
KINASE 6; CHAIN: A, C; 
CYCLIN-DEPENDENT 
KINASE INHIBITOR; 
CHAIN: B,D; 


CYCLIN-DEPENDENT 
KINASE 6; CHAIN: A; 
P19INK4D; CHAIN: B; 


CASEIN KINASE I DELTA;
1CKI 6 CHAIN: A, B; 1CKI 7




< 

I 

Q 

a « 


HEMOGLOBIN; CHAIN: A, 
B 


HEMOGLOBIN; CHAIN: A, 
B 


h 




































135.7J 


CO 

cn 
rS 


















o 


PMF 
score 






0.03 


0.24 


0.17 




1.00 












| -0.23 


o 
9 


cn 










> * 






cn 

© 
■ 




d 






Psi 
Blast 






0.00099 


0.0023 


0.0066 




2e-47 


6 

CN 


CO 

cn 
cn 


M 










r-H 




i-H 


*-* 


*— • 


START 
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cn 






i— ( 


1-M 
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00 
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PDB annotation 

i 


CHIMERA PROTEIN,
RESPIRATORY PROTEIN,
HEME






OXYGEN
STORAGE/TRANSPORT HB
D; HB D HEMOGLOBIN D (R-
STATE) 1, HEMOGLOBIN,
AVIAN, HIGH 2
COOPERATIVITY, OXYGEN
TRANSPORT


OXYGEN
STORAGE/TRANSPORT HB
D; HB D HEMOGLOBIN D (R-
STATE) 1, HEMOGLOBIN,
AVIAN, HIGH 2
COOPERATIVITY, OXYGEN
TRANSPORT




OXYGEN TRANSPORT X-
RAY STUDY, PORCINE
HEMOGLOBIN, ARTIFICIAL


Compound 


BETA-ALPHA; CHAIN: A, 
B,C,D; 


OXYGEN CARRIER
HEMOGLOBIN (DEOXY)
1HBH 3


OXYGEN CARRIER
HEMOGLOBIN (DEOXY)
1HBH 3


HEMOGLOBIN D; CHAIN:
A, C; HEMOGLOBIN D;
CHAIN: B, D;


HEMOGLOBIN D; CHAIN:
A, C; HEMOGLOBIN D;
CHAIN: B, D;


HEMOGLOBIN D; CHAIN:
A, C; HEMOGLOBIN D;
CHAIN: B, D;


OXYGEN TRANSPORT
HEMOGLOBIN (DEOXY)
1HDA 3


OXYGEN TRANSPORT
HEMOGLOBIN (DEOXY)
1HDA 3


PORCINE HEMOGLOBIN
(ALPHA SUBUNIT);
CHAIN: A, C; PORCINE


SEQFOLD 
score 




oo 

cn 
f-H 






148.19 


96.34 




136.61 




PMF 
score 






1.00


1.00






1.00




1.00


Verify 
score 






0.61 


! 0.93 






0.49 




0.78 
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Blast 
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ON 
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00 

cn 

VO 

vd 


3.3e-47 
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cn 
cn 
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cn 
cn 






•H 
Tf 
r-» 


l-H 
»— ( 


*-4 

XT 




oo • 
cn 


1—1 
r— * 


f-H 

T"H 


r-H 
rH 


START 
AA 






CN 




»— 1 






f-H 




CHAIN 
ID 




<: 


< 


< 


< 




< 


< 


< 






1hbh


1hbh


1hbr


1hbr


1hbr


1hda


1hda


1qpw


SEQID 
NO: 
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cn 
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cn 
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cn 
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/ 



PDB annotation 






OXYGEN 

STORAGE/TRANSPORT 
HEME, OXYGEN DELIVERY 
VEHICLE, BLOOD 
SUBSTITUTE 


OXYGEN TRANSPORT
OXYGEN TRANSPORT,
CHIMERA PROTEIN,
RESPIRATORY PROTEIN,
HEME


OXYGEN TRANSPORT
OXYGEN TRANSPORT,
CHIMERA PROTEIN,
RESPIRATORY PROTEIN,
HEME


Compound 


OXYGEN TRANSPORT
HEMOGLOBIN
THIONVILLE ALPHA
CHAIN MUTANT WITH
VAL 1 1BAB 3 REPLACED
BY GLU AND AN
ACETYLATED MET
BOUND TO THE 1BAB 4
AMINO TERMINUS 1BAB 5


OXYGEN TRANSPORT
HEMOGLOBIN
THIONVILLE ALPHA
CHAIN MUTANT WITH
VAL 1 1BAB 3 REPLACED
BY GLU AND AN
ACETYLATED MET
BOUND TO THE 1BAB 4
AMINO TERMINUS 1BAB 5


DEOXYHEMOGLOBIN
(ALPHA CHAIN); CHAIN:
A; DEOXYHEMOGLOBIN
(BETA CHAIN); CHAIN: B,
D;


MODULE-SUBSTITUTED
CHIMERA HEMOGLOBIN
BETA-ALPHA; CHAIN: A,
B, C, D;


MODULE-SUBSTITUTED
CHIMERA HEMOGLOBIN
BETA-ALPHA; CHAIN: A,
B, C, D;


OXYGEN TRANSPORT
MYOGLOBIN
COMPLEXED WITH
CYANIDE 1EMY 3 1EMY
107 HEME PROTEIN,
GLOBIN FOLD 1EMY 5


SEQFOLD 
score 


113.33 


! 86.57 






94.07 




PMF 
score 






0.25 


0.99 




0.95 


Verify 
score 






0.23 


0.13 




0.44 


Psi 
Blast 
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CQ 
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SEQ ID 
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PDB annotation 










OXYGEN
STORAGE/TRANSPORT HB
D; HB D HEMOGLOBIN D (R-
STATE) 1, HEMOGLOBIN,
AVIAN, HIGH 2
COOPERATIVITY, OXYGEN
TRANSPORT


OXYGEN
STORAGE/TRANSPORT HB
D; HB D HEMOGLOBIN D (R-
STATE) 1, HEMOGLOBIN,
AVIAN, HIGH 2
COOPERATIVITY, OXYGEN
TRANSPORT


OXYGEN
STORAGE/TRANSPORT HB
D; HB D HEMOGLOBIN D (R-
STATE) 1, HEMOGLOBIN,
AVIAN, HIGH 2
COOPERATIVITY, OXYGEN
TRANSPORT


Compound 


OXYGEN TRANSPORT
HEMOGLOBIN (DEOXY,
HUMAN FETAL F=II=)
1FDHG 1 1FDHH 2


OXYGEN CARRIER
HEMOGLOBIN (DEOXY)
1HBH 3


OXYGEN CARRIER
HEMOGLOBIN (DEOXY)
1HBH 3


OXYGEN CARRIER
HEMOGLOBIN (DEOXY)
1HBH 3


HEMOGLOBIN D; CHAIN:
A, C; HEMOGLOBIN D;
CHAIN: B, D;


HEMOGLOBIN D; CHAIN:
A, C; HEMOGLOBIN D;
CHAIN: B, D;


HEMOGLOBIN D; CHAIN:
A, C; HEMOGLOBIN D;
CHAIN: B, D;


OXYGEN TRANSPORT
HEMOGLOBIN (DEOXY)
1HDA 3


OXYGEN TRANSPORT
HEMOGLOBIN (DEOXY)


SEQFOLD 
score 


100.14 




101.11 


76.75 




117.48 


81.42 




105.99 


PMF 
score 




0.36 






1.00 






1.00




Verify 
score 




0.16 






0.75 






0.49 
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cn 
cn 

o 
cn 


ON 

cn 
cn 


3.3e-39 


NO 
«? 

u 

VO 
VO 


ON 

cn 


ON 

cn 


CN 

cn 
cn 


1 

cn 

H 


1.3e-41 
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SEQ ID 
NO: 
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PDB an 










Compound 


GLYCOSYL) LYSOZYME
(E.C.3.2.1.17) MUTANT
WITH CYS 54 REPLACED
BY THR, 119L 3 CYS 97
REPLACED BY ALA, ALA
134 REPLACED BY SER
(C54T,C97A, 119L 4 A134S)
119L 5


HYDROLASE (O-
GLYCOSYL) LYSOZYME
(E.C.3.2.1.17) MUTANT
WITH CYS 54 REPLACED
BY THR, 119L 3 CYS 97
REPLACED BY ALA, ALA
134 REPLACED BY SER
(C54T,C97A, 119L 4 A134S)
119L 5


HYDROLASE (O-
GLYCOSYL) LYSOZYME
(E.C.3.2.1.17) MUTANT
WITH THR 34 REPLACED
BY ALA, 174L 3 LYS 35
REPLACED BY ALA, SER
36 REPLACED BY ALA,
PRO 37 174L 4 REPLACED
BY ALA, SER 38
REPLACED BY ASP, ASN
40 REPLACED BY 174L 5
ALA, SER 44 REPLACED
BY ALA, GLU 45
REPLACED BY ALA, ASP
47 174L 6 REPLACED BY
ALA, LYS 48 REPLACED
BY ALA, CYS 54
REPLACED BY 174L 7 THR,
CYS 97 REPLACED BY
ALA


00 

1 

co oo 

i! 
ill 


SEQFOLD 
score 








PMF 
score 




0.11 


0.03 


Verify 
score 
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PDB annotation 


TRANSCRIPTION 
REGULATION PROTO- 
ONCOGENE, NUCLEAR 
BODIES (PODS), LEUKEMIA, 
2 TRANSCRIPTION 
REGULATION 


TRANSCRIPTION
REGULATION PROTO-
ONCOGENE, NUCLEAR
BODIES (PODS), LEUKEMIA,
2 TRANSCRIPTION
REGULATION




METAL BINDING PROTEIN
RING FINGER PROTEIN
MAT1; RING FINGER
(C3HC4)


DNA-BINDING PROTEIN
V(D)J RECOMBINATION
ACTIVATING PROTEIN 1;
RAG1, V(D)J
RECOMBINATION,
ANTIBODY, MAD, RING
FINGER, 2 ZINC BINUCLEAR
CLUSTER, ZINC FINGER,
DNA-BINDING PROTEIN


Compound 


TRANSCRIPTION FACTOR
PML; CHAIN: NULL;


TRANSCRIPTION FACTOR


VIRUS EQUINE HERPES
VIRUS-1 (C3HC4, OR RING
DOMAIN) 1CHC 3 (NMR, 1
STRUCTURE) 1CHC 4


VIRUS EQUINE HERPES
VIRUS-1 (C3HC4, OR RING
DOMAIN) 1CHC 3 (NMR, 1
STRUCTURE) 1CHC 4


CDK-ACTIVATING
KINASE ASSEMBLY
FACTOR MAT1; CHAIN: A;




VIRUS EQUINE HERPES
VIRUS-1 (C3HC4, OR RING
DOMAIN) 1CHC 3 (NMR, 1
STRUCTURE) 1CHC 4


VIRUS EQUINE HERPES 


SEQFOLD 
score 




















PMF 
score 


§ 
o 


3 
d 


8 
d 


o 
o 


S 
d 


d 




oo 
d 


oo 
d 


Verify 
score 


ON 

oo 

d 

i 


-0.59 


ON 
i-H 

d 


i-H 
O 

d 


c5 

d 
i 


oo 
o 
d 




o 

d 
• 


9 


Psi 
Blast 


CM 

1-H 

<£> 
co 

CM 


1.7e-05


NO 
i-H 

A- 


i-H 
i-H 


i-H 

VO 
VO 


s 

i-H 




s 

& 

00 
vd 


I 

ON 
ON 




OO 
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ro 
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CO 


CO 


CO 
CO 
CO 


00 
CO 
CO 
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CO 




*n 
vo 

i-H 


m 
vn 

i-H 


START 
AA 


oo 

CM 






oo 
oo 

CM 


ON 
OO 
CM 


00 

s 
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O 
1-H 
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O 

i-H 


|a 










< 












1 

1-H 
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o 

■g 

fH 


1 

i-H 


i-H 


1 

i-H 




o 


t-H 


SEQID 
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PDB annotation 


BINDING PROTEIN 




HYDROLASE ERA, GTPASE, 
RNA-BINDING, RAS-LIKE, 
HYDROLASE 




TRANSLATION EF-TU; 
GTPASE, MOLECULAR 
SWITCH, TRNA, RIBOSOME, 
Q-BETA REPLICASE, 2 
CHAPERONE, DISULFIDE 
ISOMERASE 


TRANSLATION EF-G; BENT
CONFORMATION, VISIBLE
DOMAIN III, MUTATION
HIS573ALA


TRANSLATION
TRANSLATIONAL GTPASE


PROTEIN BINDING EF-G; EF-
G ELONGATION FACTOR,
TRANSLOCASE, RIBOSOME,
ELONGATION, 2
TRANSLATION, PROTEIN
SYNT FACTOR, GTPASE,
GTP BINDING, 3
GUANOSINE NUCLEOTIDE
BINDING, PROTEIN
BINDING


Compound 






GTP-BINDING PROTEIN
ERA; CHAIN: A, B;


TRANSPORT AND
PROTECTION PROTEIN
ELONGATION FACTOR TU
(DOMAIN I) -
*GUANOSINE
DIPHOSPHATE 1ETU 4
COMPLEX 1ETU 5


ELONGATION FACTOR TU
(EF-TU); CHAIN: A;




ELONGATION FACTOR G;
CHAIN: A;




TRANSLATION
INITIATION FACTOR
IF2/EIF5B; CHAIN: A;


ELONGATION FACTOR G;
CHAIN: A; ELONGATION
FACTOR G DOMAIN 3;
CHAIN: B;




| QGSR ZINC FINGER 


PEPTIDE; CHAIN: A; 
DUPLEX 


SEQFOLD 
score 






















PMF 
score 






-0.12 


-0.14 


9 


9 


-0.09 


-0.14 




0.23 


Verify 
score 






0.12 


0.21 


0.09 


0.17 


0.23 


0.19 




-0.28 


Psi 
Blast 
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CO 
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1.7e-12 


00 

00 
vd 


M 




ON 
ON 


M 






a 






3 


CO 

cs 


»o 
*n 

1— 1 
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PDB annotation 


STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


Compound 




DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER


DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER
PROTEIN; CHAIN: C, F, G;


DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER
PROTEIN; CHAIN: C, F, G;


DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER
PROTEIN; CHAIN: C, F, G;




DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER
PROTEIN; CHAIN: C, F, G;


DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER


SEQFOLD 
score 




107.60 












PMF 
score 






1.00 


1.00 


1.00 


1.00 


1.00 


Verify 
score 






0.34 


0.36 


0.32 


0.47 


0.68 
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Imey 
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Imey \ 


Imey 
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NO: 
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PDB annotation 


ELEMENT, YY1, ZINC 2
FINGER PROTEIN, DNA-
PROTEIN RECOGNITION, 3
COMPLEX (TRANSCRIPTION
REGULATION/DNA)


COMPLEX (DNA-BINDING
PROTEIN/DNA) FIVE-
FINGER GLI; GLI, ZINC
FINGER, COMPLEX (DNA-
BINDING PROTEIN/DNA)


COMPLEX (DNA-BINDING
PROTEIN/DNA) FIVE-
FINGER GLI; GLI, ZINC
FINGER, COMPLEX (DNA-
BINDING PROTEIN/DNA)


COMPLEX (DNA-BINDING
PROTEIN/DNA) FIVE-
FINGER GLI; GLI, ZINC
FINGER, COMPLEX (DNA-
BINDING PROTEIN/DNA)


COMPLEX (DNA-BINDING
PROTEIN/DNA) FIVE-
FINGER GLI; GLI, ZINC
FINGER, COMPLEX (DNA-
BINDING PROTEIN/DNA)




TRANSPORT PROTEIN TC4;
GTPASE, NUCLEAR
TRANSPORT, TRANSPORT
PROTEIN


TRANSPORT PROTEIN TC4;
GTPASE, NUCLEAR
TRANSPORT, TRANSPORT
PROTEIN


Compound 




ZINC FINGER PROTEIN
GLI1; CHAIN: A; DNA;
CHAIN: C, D;


ZINC FINGER PROTEIN
GLI1; CHAIN: A; DNA;
CHAIN: C, D;


ZINC FINGER PROTEIN
GLI1; CHAIN: A; DNA;
CHAIN: C, D;


ZINC FINGER PROTEIN
GLI1; CHAIN: A; DNA;
CHAIN: C, D;


ZINC FINGER PROTEIN
GLI1; CHAIN: A; DNA;




GTP-BINDING PROTEIN 
RAN; CHAIN: A, B; 


GTP-BINDING PROTEIN 
RAN; CHAIN: A, B; 


SEQFOLD 
score 






99.97 










55.66 




PMF 
score 




1.00 




1.00 


ON 

On 
d 


0.64 






1.00 


Verify 
score 




0.58 




0.29 
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0.09 






0.62 
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PDB annotation 


(GTPASE 

ACTIVATION/PROTO- 
ONCOGENE), GTPASE, 2 
TRANSITION STATE, GAP 


COMPLEX (GTP-
BINDING/EFFECTOR) RAS-
RELATED PROTEIN RAB3A;
COMPLEX (GTP-
BINDING/EFFECTOR), G
PROTEIN, EFFECTOR,
RABCDR, 2 SYNAPTIC
EXOCYTOSIS, RAB
PROTEIN, RAB3A,
RABPHILIN


COMPLEX (GTP-
BINDING/EFFECTOR) RAS-
RELATED PROTEIN RAB3A;
COMPLEX (GTP-
BINDING/EFFECTOR), G
PROTEIN, EFFECTOR,
RABCDR, 2 SYNAPTIC
EXOCYTOSIS, RAB
PROTEIN, RAB3A,
RABPHILIN


PROTEIN BINDING EF-G; EF-
G ELONGATION FACTOR,
TRANSLOCASE, RIBOSOME,
ELONGATION, 2
TRANSLATION, PROTEIN
SYNT FACTOR, GTPASE,
GTP BINDING, 3
GUANOSINE NUCLEOTIDE
BINDING, PROTEIN
BINDING


HYDROLASE G PROTEIN,
VESICULAR TRAFFICKING,
GTP HYDROLYSIS, RAB 2
PROTEIN,
NEUROTRANSMITTER


1 
1 




RAB-3A; CHAIN: A;
RABPHILIN-3A; CHAIN: B;




ELONGATION FACTOR G; 
CHAIN: A; ELONGATION 
FACTOR G DOMAIN 3; 
CHAIN: B; 



RAB3A; CHAIN: A; 


SEQFOLD 
score 






63.80 




71.97 


PMF 
score 




1.00 




0.35 




Verify 
score 




0.74 
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PDB annotation 


REGULATION/DNA)
COMPLEX (TRANSCRIPTION
REGULATION/DNA), RNA
POLYMERASE III, 2
TRANSCRIPTION
INITIATION, ZINC FINGER
PROTEIN


COMPLEX (TRANSCRIPTION
REGULATION/DNA)
COMPLEX (TRANSCRIPTION
REGULATION/DNA), RNA
POLYMERASE III, 2
TRANSCRIPTION
INITIATION, ZINC FINGER
PROTEIN


COMPLEX (TRANSCRIPTION
REGULATION/DNA)
COMPLEX (TRANSCRIPTION
REGULATION/DNA), RNA
POLYMERASE III, 2
TRANSCRIPTION
INITIATION, ZINC FINGER
PROTEIN




REGULATION/DNA) YING-
YANG 1; TRANSCRIPTION
INITIATION, INITIATOR
ELEMENT, YY1, ZINC 2
FINGER PROTEIN, DNA-
PROTEIN RECOGNITION, 3
COMPLEX (TRANSCRIPTION


Compound 


RIBOSOMAL RNA GENE;
CHAIN: B, C, E, F;


TFIIIA; CHAIN: A, D; 5S
RIBOSOMAL RNA GENE;
CHAIN: B, C, E, F;


TFIIIA; CHAIN: A, D; 5S
RIBOSOMAL RNA GENE;
CHAIN: B, C, E, F;


DNA; CHAIN: A, B; 


SEQFOLD 
score 








95.69 




PMF
score




0.04 


0.70 




0 


Verify 
score 




-0.09 


0.04 
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PDB annotation 


PROTEIN/DNA) FIVE-
FINGER GLI; GLI, ZINC
FINGER, COMPLEX (DNA-
BINDING PROTEIN/DNA)


COMPLEX (DNA-BINDING
PROTEIN/DNA) FIVE-
FINGER GLI; GLI, ZINC
FINGER, COMPLEX (DNA-
BINDING PROTEIN/DNA)




SERINE ESTERASE 
HYDROLASE, SERINE 
ESTERASE, GLYCOPROTEIN 




RNA-BINDING
PROTEIN/RNA TRA PRE-
MRNA; SPLICING
REGULATION, RNP
DOMAIN, RNA COMPLEX


GENE REGULATION/RNA
POLY(A) BINDING PROTEIN
1, PABP 1; RRM, PROTEIN-
RNA COMPLEX, GENE
REGULATION/RNA


RNA BINDING PROTEIN
RNA-BINDING DOMAIN 2


STRUCTURAL PROTEIN
PROTEIN C23; RNP, RBD,
RRM, RNA BINDING
DOMAIN, NUCLEOLUS


STRUCTURAL PROTEIN
PROTEIN C23; RNP, RBD,
RRM, RNA BINDING
DOMAIN, NUCLEOLUS


NUCLEAR PROTEIN
HETEROGENEOUS
NUCLEAR


Compound 


GLI1; CHAIN: A; DNA;
CHAIN: C, D;


ZINC FINGER PROTEIN
GLI1; CHAIN: A; DNA;
CHAIN: C, D;




CUTINASE; CHAIN: NULL; 






lit-tp 


HU ANTIGEN C; CHAIN: A; 


NUCLEOLIN RBD1;
CHAIN: A;


NUCLEOLIN RBD2;
CHAIN: A;


HNRNP A1; CHAIN: NULL;


SEQFOLD 
score 
























PMF 
score 




1.00 




0.01 




1.00 


0.92 


0.95 
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d 


0.74 


0.80 


Verify 
score 




O 
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0.58 


0.17 
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0.51 
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PDB annotation 


(PROTEASE/INHIBITOR) 
COMPLEX 

(PROTEASE/INHIBITOR), 
TISSUE KALLIKREIN, 
SERINE 2 PROTEASE, 
TRYPSIN, PSA, KININ, 
SERPIN 


GLYCOPROTEIN
GLYCOPROTEIN


GLYCOPROTEIN 
GLYCOPROTEIN 


GLYCOPROTEIN 
GLYCOPROTEIN 


SERINE PROTEASE 
INHIBITOR FACTOR XA 
INHIBITOR; ANTISTASIN, 
CRYSTAL STRUCTURE, 
FACTOR XA INHIBITOR, 2 
SERINE PROTEASE 
INHIBITOR, THROMBOSIS 








COMPLEX (GTPASE-
ACTIVATING/GTP-BINDING)
COMPLEX (GTPASE-
ACTIVATING/GTP-
BINDING), GTPASE
ACTIVATION


TRANSPORT PROTEIN TC4;
GTPASE, NUCLEAR
TRANSPORT, TRANSPORT
PROTEIN


Compound 


X, Y; HIRUSTASIN; CHAIN: 
I, J; 


LAMININ; CHAIN: NULL; 


LAMININ; CHAIN: NULL;




LAMININ; CHAIN: NULL; 


ANTISTASIN; CHAIN: 
NULL; 




LECTIN (AGGLUTININ)
WHEAT GERM
AGGLUTININ (ISOLECTIN
2) 9WGA 3


LECTIN (AGGLUTININ)
WHEAT GERM
AGGLUTININ (ISOLECTIN
2) 9WGA 3




P50-RHOGAP; CHAIN: A, B,
C; CDC42HS; CHAIN: D, E,
F;


GTP-BINDING PROTEIN
RAN; CHAIN: A, B;




SEQFOLD
score








73.03 




60.96 






66.17 


97.94 


PMF
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-0.02 


0.11




0.18 




-0.19 
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0.40 
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PDB annotation 




COMPLEX (SMALL 
GTPASE/NUCLEAR 
PROTEIN) COMPLEX 
(SMALL GTPASE/NUCLEAR 
PROTEIN), SMALL GTPASE, 
2 NUCLEAR TRANSPORT 


COMPLEX(GTPASE 
ACTIVATN/PROTO- 
ONCOGENE) GTPASE- 
ACTIVATING PROTEIN 
RHOGAP; COMPLEX 
(GTPASE 

ACTIVATION/PROTO- 
ONCOGENE), GTPASE, 2 
TRANSITION STATE, GAP 


COMPLEX (GTP-
BINDING/EFFECTOR) RAS-
RELATED PROTEIN RAB3A;
COMPLEX (GTP-
BINDING/EFFECTOR), G
PROTEIN, EFFECTOR,
RABCDR, 2 SYNAPTIC
EXOCYTOSIS, RAB
PROTEIN, RAB3A,
RABPHILIN


COMPLEX (GTP-
BINDING/EFFECTOR) RAS-
RELATED PROTEIN RAB3A;
COMPLEX (GTP-
BINDING/EFFECTOR), G


Compound 


RAS P21 PROTEIN 
MUTANT WITH GLY 12 
REPLACED BY PRO IPU 3 


S lit 

3 § □ Q 


TRIPHOSPHATE IPU 5 || 


u _ |ff 

S Put s 


i 

1 

1 

" 
i 


P50-RHOGAP; CHAIN: A; 
TRANSFORMING PROTEIN 
RHOA; CHAIN: B; 


RAB-3A; CHAIN: A; 
RABPHILIN-3A; CHAIN: B;


< CO 

□ 3 ' 

il 


SEQFOLD 


2 




104.86 


73.28 


115.76 




PMF 1 




B 










1.00 




score ! 










0.76 
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PDB annotation 


INTERACTION, PROTEIN 
DESIGN, 2 CRYSTAL 
STRUCTURE, COMPLEX 
(ZINC FINGER/DNA) 


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


DNA-BINDING PROTEIN
V(D)J RECOMBINATION
ACTIVATING PROTEIN 1;
RAG1, V(D)J
RECOMBINATION,
ANTIBODY, MAD, RING
FINGER, 2 ZINC BINUCLEAR
CLUSTER, ZINC FINGER,
DNA-BINDING PROTEIN


Compound 




DNA; CHAIN: A, B, D, E; 
CONSENSUS ZINC FINGER 
PROTEIN; CHAIN: C, F, G; 


DNA; CHAIN: A, B, D, E; 
CONSENSUS ZINC FINGER 
PROTEIN; CHAIN: C, F, G; 


DNA; CHAIN: A, B, D, E; 
CONSENSUS ZINC FINGER 
PROTEIN; CHAIN: C, F, G; 


RAG1; CHAIN: NULL;


TRANSCRIPTION FACTOR
IIIA; CHAIN: A; 5S RNA
GENE; CHAIN: E, F;


SEQFOLD 
score 














PMF 
score 




0.64 


0.99 


0.95 


0.33 


0.62 


Verify 
score 




0.37 


0.58 


0.08 


-0.66 


0.29 
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PDB annotation 


TRANSFERASE, 
SERINE/THREONINE- 
PROTEIN KINASE, 2 PROTO- 
ONCOGENE, ZINC, ATP- 
BINDING, PHORBOL-ESTER 
BINDING 


SERINE/THREONINE
PROTEIN KINASE
TRANSFERASE,
SERINE/THREONINE-
PROTEIN KINASE, 2 PROTO-
ONCOGENE, ZINC, ATP-
BINDING, PHORBOL-ESTER
BINDING


SERINE/THREONINE
PROTEIN KINASE
TRANSFERASE,
SERINE/THREONINE-
PROTEIN KINASE, 2 PROTO-
ONCOGENE, ZINC, ATP-
BINDING, PHORBOL-ESTER
BINDING


PHOSPHOTRANSFERASE 


PHOSPHOTRANSFERASE


PHOSPHOTRANSFERASE


CALCIUM-BINDING
PROTEIN RAT BRAIN PKC-
G; CALCIUM-BINDING
PROTEIN, PROTEIN KINASE
C, PKC, TRANSFERASE


CALCIUM-BINDING
PROTEIN RAT BRAIN PKC-
G; CALCIUM-BINDING
PROTEIN, PROTEIN KINASE
C, PKC, TRANSFERASE
CALCIUM-BINDING


Compound 




RAF-1; CHAIN: NULL; 


RAF-1; CHAIN: NULL; 


PROTEIN KINASE C
DELTA TYPE; 1PTQ 4


PROTEIN KINASE C
DELTA TYPE; 1PTQ 4


PROTEIN KINASE C
DELTA TYPE; 1PTQ 4


PROTEIN KINASE C, 
GAMMA TYPE; CHAIN: 
NULL; 


PROTEIN KINASE C,
GAMMA TYPE; CHAIN:
NULL;


TROPONIN C; 1TNX 4


SEQFOLD 
score 




















score 




1.00


0.19 


0.19 


0.19 


0.31 


0.25 


0.25 


0.11 


Verify
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9 
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0.17 


-0.36 
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PDB annotation 


FINGER/DNA) COMPLEX 
(ZINC FINGER/DNA), ZINC 
FINGER, DNA-BINDING 
PROTEIN 


COMPLEX (ZINC
FINGER/DNA) COMPLEX
(ZINC FINGER/DNA), ZINC
FINGER, DNA-BINDING
PROTEIN


COMPLEX (ZINC
FINGER/DNA) COMPLEX
(ZINC FINGER/DNA), ZINC
FINGER, DNA-BINDING
PROTEIN


COMPLEX (ZINC
FINGER/DNA) COMPLEX
(ZINC FINGER/DNA), ZINC
FINGER, DNA-BINDING
PROTEIN


COMPLEX (ZINC
FINGER/DNA) COMPLEX
(ZINC FINGER/DNA), ZINC
FINGER, DNA-BINDING
PROTEIN


COMPLEX (ZINC
FINGER/DNA) COMPLEX
(ZINC FINGER/DNA), ZINC
FINGER, DNA-BINDING
PROTEIN




Compound 


PEPTIDE; CHAIN: A;
DUPLEX
OLIGONUCLEOTIDE
BINDING SITE; CHAIN: B,
C;


QGSR ZINC FINGER
PEPTIDE; CHAIN: A;
DUPLEX
OLIGONUCLEOTIDE
BINDING SITE; CHAIN: B,
C;


QGSR ZINC FINGER
PEPTIDE; CHAIN: A;
DUPLEX
OLIGONUCLEOTIDE
BINDING SITE; CHAIN: B,
C;


QGSR ZINC FINGER
PEPTIDE; CHAIN: A;
DUPLEX
OLIGONUCLEOTIDE
BINDING SITE; CHAIN: B,
C;


QGSR ZINC FINGER
PEPTIDE; CHAIN: A;
DUPLEX
OLIGONUCLEOTIDE
BINDING SITE; CHAIN: B,
C;


QGSR ZINC FINGER
PEPTIDE; CHAIN: A;
DUPLEX
OLIGONUCLEOTIDE
BINDING SITE; CHAIN: B,
C;


DNA; CHAIN: A, B, D, E; 
CONSENSUS ZINC FINGER 
PROTEIN; CHAIN: C, F, G; 


SEQFOLD 
score 






80.11 










PMF 
score 




1.00 




1.00 


1.00 


0.11 


0.03 


Verify 
score 
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PDB annotation I 


FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (TRANSCRIPTION
REGULATION/DNA) TFIIIA;
5S GENE; NMR, TFIIIA,
PROTEIN, DNA,
TRANSCRIPTION FACTOR,
5S RNA 2 GENE, DNA
BINDING PROTEIN, ZINC
FINGER, COMPLEX 3
(TRANSCRIPTION
REGULATION/DNA)


COMPLEX (TRANSCRIPTION
REGULATION/DNA)
COMPLEX (TRANSCRIPTION
REGULATION/DNA), RNA
POLYMERASE III, 2
TRANSCRIPTION
INITIATION, ZINC FINGER
PROTEIN


REGULATION/DNA)
COMPLEX (TRANSCRIPTION
REGULATION/DNA), RNA
POLYMERASE III, 2
TRANSCRIPTION
INITIATION, ZINC FINGER
PROTEIN


Compound 


CONSENSUS ZINC FINGER 
PROTEIN; CHAIN: C, F, G; 


DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER
PROTEIN; CHAIN: C, F, G;


TRANSCRIPTION FACTOR
IIIA; CHAIN: A; 5S RNA
GENE; CHAIN: E, F;


TFIIIA; CHAIN: A, D; 5S
RIBOSOMAL RNA GENE;
CHAIN: B, C, E, F;


SEQFOLD 
score 






60.10 


114.59 

j 




PMF


score 
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PDB annotation 


ELEMENT, YY1, ZINC 2
FINGER PROTEIN, DNA-
PROTEIN RECOGNITION, 3
COMPLEX (TRANSCRIPTION
REGULATION/DNA)


REGULATION/DNA) YING-
YANG 1; TRANSCRIPTION
INITIATION, INITIATOR
ELEMENT, YY1, ZINC 2
FINGER PROTEIN, DNA-
PROTEIN RECOGNITION, 3
COMPLEX (TRANSCRIPTION
REGULATION/DNA)


COMPLEX (TRANSCRIPTION
REGULATION/DNA) YING-
YANG 1; TRANSCRIPTION
INITIATION, INITIATOR
ELEMENT, YY1, ZINC 2
FINGER PROTEIN, DNA-
PROTEIN RECOGNITION, 3
COMPLEX (TRANSCRIPTION
REGULATION/DNA)


COMPLEX (TRANSCRIPTION
REGULATION/DNA) YING-
YANG 1; TRANSCRIPTION
INITIATION, INITIATOR
ELEMENT, YY1, ZINC 2
FINGER PROTEIN, DNA-
PROTEIN RECOGNITION, 3
COMPLEX (TRANSCRIPTION
REGULATION/DNA)


COMPLEX (DNA-BINDING
PROTEIN/DNA) FIVE-
FINGER GLI; GLI, ZINC
FINGER, COMPLEX (DNA-
BINDING PROTEIN/DNA)


COMPLEX (DNA-BINDING
PROTEIN/DNA) FIVE-


Compound 


< 

I 


ASSOCIATED VIRUS P5
INITIATOR ELEMENT
DNA; CHAIN: A, B;


YY1; CHAIN: C; ADENO-
ASSOCIATED VIRUS P5
INITIATOR ELEMENT
DNA; CHAIN: A, B;


YY1; CHAIN: C; ADENO-
ASSOCIATED VIRUS P5
INITIATOR ELEMENT
DNA; CHAIN: A, B;


ZINC FINGER PROTEIN
GLI1; CHAIN: A; DNA;
CHAIN: C, D;


ZINC FINGER PROTEIN
GLI1; CHAIN: A; DNA;


SEQFOLD
score 


r 

c 


8 

o 










PMF 
score 






0.89 
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score 
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PDB annotation 


FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL


Compound 


CONSENSUS ZINC FINGER 
PROTEIN; CHAIN: C, F, G; 


DNA; CHAIN: A, B, D, E; 
CONSENSUS ZINC FINGER 
PROTEIN; CHAIN: C, F, G; 


DNA; CHAIN: A, B, D, E; 
CONSENSUS ZINC FINGER 
PROTEIN; CHAIN: C, F, G; 


DNA; CHAIN: A, B, D, E; 
CONSENSUS ZINC FINGER 
PROTEIN; CHAIN: C, F, G; 


DNA; CHAIN: A, B, D, E; 
CONSENSUS ZINC FINGER 
PROTEIN; CHAIN: C, F, G; 


DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER 
PROTEIN; CHAIN: C, F, G; 


SEQFOLD 
score 
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score 
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PDB annotation 


FINGER, COMPLEX (DNA- 
BINDING PROTEIN/DNA) 


COMPLEX (DNA-BINDING 
PROTEIN/DNA) FIVE- 
FINGER GLI; GLI, ZINC 
FINGER, COMPLEX (DNA- 
BINDING PROTEIN/DNA) 


COMPLEX (DNA-BINDING
PROTEIN/DNA) FIVE-
FINGER GLI; GLI, ZINC
FINGER, COMPLEX (DNA-
BINDING PROTEIN/DNA)




CALCIUM-BINDING
PROTEIN 2A9, CACY,
S100A6, PRA; CALCIUM-
BINDING PROTEIN, EF-
HAND, S-100 PROTEIN, NMR


CALCIUM/PHOSPHOLIPID
BINDING PROTEIN P11,
CALPACTIN LIGHT CHAIN;
S100 FAMILY, EF-HAND
PROTEIN, LIGAND OF
ANNEXIN II, 2
CALCIUM/PHOSPHOLIPID
BINDING PROTEIN


METAL BINDING PROTEIN
S100B, S100BETA;
S100BETA, S100B, NMR,
DIPOLAR COUPLINGS, EF-
HAND, S100 2 PROTEIN,
CALCIUM-BINDING
PROTEIN, FOUR-HELIX
BUNDLE, THREE- 3
DIMENSIONAL STRUCTURE,
SOLUTION STRUCTURE


CALCIUM-BINDING
PROTEIN SNTNC; CALCIUM-
BINDING, REGULATION,


Compound 




ZINC FINGER PROTEIN
GLI1; CHAIN: A; DNA;
CHAIN: C, D;


ZINC FINGER PROTEIN
GLI1; CHAIN: A; DNA;
CHAIN: C, D;






CALCYCLIN (RABBIT, 
CA2+); CHAIN: A, B; 




S-100 PROTEIN, BETA
CHAIN; CHAIN: A, B;


N-TROPONIN C; CHAIN:




SEQFOLD 
score 










70.10 
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PMF 
score 
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PDB annotation 




COMPLEX (ZINC
FINGER/DNA) COMPLEX
(ZINC FINGER/DNA), ZINC
FINGER, DNA-BINDING
PROTEIN


COMPLEX (ZINC
FINGER/DNA) COMPLEX
(ZINC FINGER/DNA), ZINC
FINGER, DNA-BINDING
PROTEIN


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)


COMPLEX (ZINC
FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL


Compound 


QGSR ZINC FINGER
PEPTIDE; CHAIN: A;
DUPLEX
OLIGONUCLEOTIDE
BINDING SITE; CHAIN: B,
C;


QGSR ZINC FINGER
PEPTIDE; CHAIN: A;
DUPLEX
OLIGONUCLEOTIDE
BINDING SITE; CHAIN: B,
C;




BINDING PROTEIN MBP-1
MUTANT WITH CYS 11
1BBO 3 REPLACED BY
ABU (C11ABU) (NMR, 60


O 

DQ 

PQ 
■— » 


DNA; CHAIN: A, B, D, E; 
CONSENSUS ZINC FINGER 
PROTEIN; CHAIN: C, F, G; 


DNA; CHAIN: A, B, D, E; 
CONSENSUS ZINC FINGER 
PROTEIN; CHAIN: C, F, G; 


DNA; CHAIN: A, B, D, E; 
CONSENSUS ZINC FINGER 
PROTEIN; CHAIN: C, F, G; 


SEQFOLD 
score 
















PMF 
score 




0.01 


0.25 


0.46 


0.21 
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0.41 


Verify 
score 
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PDB annotation 


KINASE KINASE, SIGNAL 

TRANSDUCTION, 

CALCIUM/CALMODULIN 


TRANSFERASE 
TRANSFERASE, 


SERINE/THREONINE-
PROTEIN KINASE, CASEIN
KINASE, 2 SER/THR KINASE


PROTEIN KINASE CDK2;
PROTEIN KINASE, CELL
CYCLE,


Compound 


CALCIUM/CALMODULIN- 
DEPENDENT PROTEIN 
KINASE; CHAIN: NULL; 


PROTEIN KINASE 
CK2/ALPHA-SUBUNIT; 


CHAIN: NULL; 


TRANSFERASE(PHOSPHO
TRANSFERASE) $C-/AMP$-
DEPENDENT PROTEIN
KINASE (E.C.2.7.1.37)
($C/APK$) 1APM 3
(CATALYTIC SUBUNIT)
ALPHA ISOENZYME
MUTANT WITH SER 139
1APM 4 REPLACED BY
ALA (/S139A$) COMPLEX
WITH THE PEPTIDE 1APM
5 INHIBITOR PKI(5-24)
AND THE DETERGENT
MEGA-8 1APM 6


TRANSFERASE(PHOSPHO
TRANSFERASE) $C-/AMP$-
DEPENDENT PROTEIN
KINASE (E.C.2.7.1.37)
($C/APK$) 1APM 3
(CATALYTIC SUBUNIT)
ALPHA ISOENZYME
MUTANT WITH SER 139
1APM 4 REPLACED BY
ALA (/S139A$) COMPLEX
WITH THE PEPTIDE 1APM
5 INHIBITOR PKI(5-24)
AND THE DETERGENT
MEGA-8 1APM 6


CYCLIN-DEPENDENT 
PROTEIN KINASE 2; 
CHAIN: NULL; 
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PDB annotation 


SERINE/THREONINE- 
PROTEIN KINASE, MAP 
KINASE, 2 ERK2 




SERINE PROTEASE SERINE 
PROTEINASE, TRYPSIN, 
HYDROLASE 


SERINE PROTEASE SERINE
PROTEINASE, TRYPSIN,
HYDROLASE


SERINE PROTEINASE
TRYPSIN-LIKE SERINE
PROTEINASE, TETRAMER,
HEPARIN, ALLERGY, 2
ASTHMA


SERINE PROTEASE
HYDROLASE, SERINE
PROTEASE


SERINE PROTEASE
HYDROLASE, SERINE
PROTEASE


SERINE PROTEASE
PRORENIN CONVERTING
ENZYME (PRECE),
EPIDERMAL GLANDULAR
KALLIKREIN, SERINE
PROTEASE, PROTEIN
MATURATION




HYDROLASE, SERINE
PROTEINASE), PLASMA
CALCIUM BINDING, 2
GLYCOPROTEIN, COMPLEX
(BLOOD
COAGULATION/INHIBITOR)


SERINE PROTEASE SERINE
PROTEASE HEADER


Compound 






TRYPSIN; CHAIN: A, B, C, 
D; 


TRYPSIN; CHAIN: A, B, C, 
D; 


go 
pq <: 


ALPHA TRYPSIN; CHAIN:
A, B;


ALPHA TRYPSIN; CHAIN:
A, B;


cn 

rH 

sjjj 


if 

< v Z 




ALPHA THROMBIN; 
CHAIN: A, B, F, E; 


SEQFOLD 
score 






172.43 




124.61 
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124.65 




PMF 
score 








1.00 




1.00 


0.99 
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PDB annotation 


SERINE PROTEASE SERINE 
PROTEASE, HYDROLASE, 
MAST CELL, ANGIOTENSIN, 
ALPHA 2 

TOLUENESULFONIC ACID 






SERINE PROTEINASE 
SERINE PROTEINASE, 
GLYCOPROTEIN 


els 

q g < 

a g 1 

ooS 
uuE 


COMPLEX, INHIBITOR,
HEMOPHILIA/EGF, BLOOD
COAGULATION, 2 PLASMA,
SERINE PROTEASE,
CALCIUM-BINDING,
HYDROLASE, 3
GLYCOPROTEIN


HYDROLASE
MICROPLASMINOGEN,
SERINE PROTEASE,
ZYMOGEN,


M 

5 

^3 

go 

i 


GROWTH FACTOR 7S NGF;
GROWTH FACTOR (BETA-
NGF), HYDROLASE - SERINE


Compound 


CHYMASE; CHAIN: NULL; 


COMPLEX(PROTEINASE/I
NHIBITOR) TRYPSIN
(E.C.3.4.21.4) COMPLEXED
WITH INHIBITOR FROM
BITTER 1MCT 3 GOURD
1MCT 4


COMPLEX(PROTEINASE/I
NHIBITOR) TRYPSIN
(E.C.3.4.21.4) COMPLEXED
WITH INHIBITOR FROM
BITTER 1MCT 3 GOURD
1MCT 4


NEUROPSIN; CHAIN: A, B; 


FACTOR IXA; CHAIN: C, 
L,;D-PHE-PRO-ARG; 




P 


) 
S 


NERVE GROWTH 
FACTOR; CHAIN: A, B, G, 
X,Y,Z; 


SEQFOLD
score


125.55 


173.18 




139.50 


115.05 


114.05 


119.87 


PMF 
score 






1.00 










Verify 
score 






0.76
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PDB annotation 


PROTEINASE 2 (GAMMA- 
NGF), INACTIVE SERINE 
PROTEINASE (ALPHA-NGF) 


GROWTH FACTOR 7S NGF;
GROWTH FACTOR (BETA-
NGF), HYDROLASE - SERINE
PROTEINASE 2 (GAMMA-
NGF), INACTIVE SERINE
PROTEINASE (ALPHA-NGF)


GROWTH FACTOR 7S NGF; 
GROWTH FACTOR (BETA- 
NGF), HYDROLASE - SERINE 
PROTEINASE 2 (GAMMA- 
NGF), INACTIVE SERINE 
PROTEINASE (ALPHA-NGF) 


COMPLEX (SERINE
PROTEASE/INHIBITOR)
TRYPSIN INHIBITOR;
SERINE PROTEASE,
INHIBITOR, COMPLEX,
METAL BINDING SITES, 2
PROTEIN ENGINEERING,
PROTEASE-SUBSTRATE
INTERACTIONS, 3
METALLOPROTEINS


COMPLEX (SERINE
PROTEASE/INHIBITOR)
TRYPSIN INHIBITOR;
SERINE PROTEASE,
INHIBITOR, COMPLEX,
METAL BINDING SITES, 2
PROTEIN ENGINEERING,
PROTEASE-SUBSTRATE
INTERACTIONS, 3
METALLOPROTEINS


Compound 




NERVE GROWTH 
FACTOR; CHAIN: A, B, G, 
X,Y,Z; 


NERVE GROWTH 
FACTOR; CHAIN: A, B, G, 
X,Y,Z; 


ECOTIN; CHAIN: A; 
ANIONIC TRYPSIN;
CHAIN: B; 




ECOTIN; CHAIN: A; 
ANIONIC TRYPSIN; 
CHAIN: B; 


HYDROLASE(SERINE 
PROTEINASE) TONIN (E.C. 
NUMBER NOT ASSIGNED) 
1TON 4


SEQFOLD 
score 




134.19 




165.58 




145.44 


PMF 
score 






1.00 




1.00 




Verify 
score 






0.60 




0.67 




Psi 
Blast 
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00 
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CO 
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SEQID 
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§ £ o °i 

00 K Ph Cu CO 
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PDB annotation 


i O 


IMMUNOGLOBULIN
IMMUNOGLOBULIN, FAB
COMPLEX, IDIOTOPE, ANTI-
IDIOTOPE


IMMUNOGLOBULIN, FAB
COMPLEX, IDIOTOPE, ANTI-
IDIOTOPE


IMMUNE SYSTEM VON
WILLEBRAND FACTOR,
GLYCOPROTEIN IBA
(A:ALPHA) BINDING, 2
COMPLEX




Compound 




IG HEAVY CHAIN V
REGIONS; CHAIN: A; IG
HEAVY CHAIN V
REGIONS; CHAIN: B; IG
HEAVY CHAIN V
REGIONS; CHAIN: C; IG
HEAVY CHAIN V
REGIONS; CHAIN: D;


IG HEAVY CHAIN V
REGIONS; CHAIN: A; IG
HEAVY CHAIN V
REGIONS; CHAIN: B; IG
HEAVY CHAIN V
REGIONS; CHAIN: C; IG
HEAVY CHAIN V
REGIONS; CHAIN: D;


IMMUNOGLOBULIN NMC-
4 IGG1; CHAIN: L;
IMMUNOGLOBULIN NMC-
4 IGG1; CHAIN: H; VON
WILLEBRAND FACTOR;
CHAIN: A;


IMMUNOGLOBULIN/VIRU
S HEMAGGLUTININ
IGG2A FAB FRAGMENT
(FAB 26/9) COMPLEXED
WITH INFLUENZA 1FRG 3
HEMAGGLUTININ HA1
(STRAIN X47) (RESIDUES
101-108) 1FRG 4


IMMUNOGLOBULIN FAB 
FRAGMENT OF 


SEQFOLD 
score 




65.86 




• 


66.17 


66.74 


PMF 
score 






o 


rH 
O 

o 






1 ! 






as 

1 


r- 
o 






Psi 
Blast 




1.4e-05 


8 

i 


i 




8 

O 

*n 






00 

CN 
CN 




i-H 




00 
CN 
CN 


START 
AA 




*-4 

CN 


cn 


v> 

CO 




CN 


CHAIN 
ID 




Q 


Q 




K 




la 




o 
o 


o 
*o 

1—1 


<E 


00 


I 


SEQID 
NO: 




<N 

J? 


<N 
9 


CN 
CN 


<N 
9 
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PDB annotation 




# 

I 


V^WIVUT l~tl±SX. « 

(IMMUNORECEPTOR/IMMU
NOGLOBULIN) COMPLEX
(IMMUNORECEPTOR/IMMU
NOGLOBULIN)


IMMUNOGLOBULIN FAB
FRAGMENT,
IMMUNOGLOBULIN


IMMUNOGLOBULIN FAB
FRAGMENT,
IMMUNOGLOBULIN


23 


SYSTEM 


PCX/" 1 
/ 

1 


COMPLEX 

mv/TNyrr tmoht nRT JLIN/AUTOZS 




Compound 


HUMANIZED ANTIBODY
4D5, VERSION 4 1FVD 3


IMMUNOGLOBULIN FAB
D44.1 (IGG1, KAPPA)
(BALB/C MOUSE,
MONOCLONAL
ANTIBODY) 1MLB 5


' -I 

S?.S? 

25 2 U Bu 


FAB 1583; CHAIN: L, H


FAB 1583; CHAIN: L, H


IGG3-KAPPA ANTIBODY
(LIGHT CHAIN); CHAIN: A,
C; IGG3-KAPPA
ANTIBODY (HEAVY
CHAIN); CHAIN: B, D;


IMMUNOGLOBULIN FAB
FRAGMENT FROM
HUMAN
IMMUNOGLOBULIN IGG1
(LAMBDA, HIL) 8FAB 3


IGG4 REA; CHAIN: A; RF-
AN IGM/LAMBDA; CHAIN:
H, L;


HEMOLIN; CHAIN: A, B; 


SEQFOLD 
score 








67.89 






65.00 






PMF 
score 




0.07 


0.42 




0.19 


0.13 




0.09 


-0.02 


Verify 
score 




-0.26 


0.07 




0.02 


-0.34 




0.08 


0.12 


Psi 
Blast 
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PDB annotation 

FOLD, GLYCOPROTEIN


CELL ADHESION NCAM;
NCAM, IMMUNOGLOBULIN
FOLD, GLYCOPROTEIN


GROWTH FACTOR/GROWTH
FACTOR RECEPTOR FGF2;
FGFR2; IMMUNOGLOBULIN
(IG) LIKE DOMAINS
BELONGING TO THE I-SET 2
SUBGROUP WITHIN IG-LIKE
DOMAINS, B-TREFOIL FOLD


GROWTH FACTOR/GROWTH
FACTOR RECEPTOR FGF2;
FGFR2; IMMUNOGLOBULIN
(IG) LIKE DOMAINS
BELONGING TO THE I-SET 2
SUBGROUP WITHIN IG-LIKE
DOMAINS, B-TREFOIL FOLD


GROWTH FACTOR/GROWTH
FACTOR RECEPTOR FGF2;
FGFR2; IMMUNOGLOBULIN
(IG) LIKE DOMAINS
BELONGING TO THE I-SET 2
SUBGROUP WITHIN IG-LIKE
DOMAINS, B-TREFOIL FOLD


GROWTH FACTOR/GROWTH
FACTOR RECEPTOR FGF1;
FGFR1; IMMUNOGLOBULIN
(IG) LIKE DOMAINS
BELONGING TO THE I-SET 2
SUBGROUP WITHIN IG-LIKE
DOMAINS, B-TREFOIL FOLD




Compound 

r» n. 


NEURAL CELL ADHESION
MOLECULE; CHAIN: A, B,
C, D;


FIBROBLAST GROWTH
FACTOR 2; CHAIN: A, B, C,
D; FIBROBLAST GROWTH
FACTOR RECEPTOR 2;
CHAIN: E, F, G, H;


FIBROBLAST GROWTH
FACTOR 2; CHAIN: A, B, C,
D; FIBROBLAST GROWTH
FACTOR RECEPTOR 2;
CHAIN: E, F, G, H;


FIBROBLAST GROWTH
FACTOR 2; CHAIN: A, B, C,
D; FIBROBLAST GROWTH
FACTOR RECEPTOR 2;
CHAIN: E, F, G, H;


FIBROBLAST GROWTH
FACTOR 1; CHAIN: A, B;
FIBROBLAST GROWTH
FACTOR RECEPTOR 1;
CHAIN: C, D;


FIBROBLAST GROWTH
FACTOR 1; CHAIN: A, B;
FIBROBLAST GROWTH
FACTOR RECEPTOR 1;
CHAIN: C, D;


SEQFOLD 
score 














PMF 
score 


0.19 


0.03 


0.00 


0.27 


0.21 


8 
9 


Verify 
score 


0.11 


-0.09 


© 


-0.00 


s 
? 


0.12 


Psi 
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5.1e-20 


a 


1-4 


1.4e-23 


00 

f-H 

6 

i-H 

»n 


1.7e-10 




oo 
oo 
oo 


cn 

i-H 

oo 


r- 

00 


00 

oo 
oo 


NO 




START 
AA 


1—* 

cn 
r» 


5? 

vo 


VO 

s 


ON 
ro 


3 
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u 




lepf 


si 

«— < 
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PDB annotation 


PROTEASE


MULTICATALYTIC
PROTEINASE
MULTICATALYTIC
PROTEINASE, 20S
PROTEASOME, PROTEIN 2
DEGRADATION, ANTIGEN
PROCESSING, HYDROLASE,
PROTEASE


MULTICATALYTIC
PROTEINASE
MULTICATALYTIC
PROTEINASE, 20S
PROTEASOME, PROTEIN 2
DEGRADATION, ANTIGEN
PROCESSING, HYDROLASE,
PROTEASE


MULTICATALYTIC
PROTEINASE
MULTICATALYTIC
PROTEINASE, 20S
PROTEASOME, PROTEIN 2
DEGRADATION, ANTIGEN
PROCESSING, HYDROLASE,
PROTEASE


MULTICATALYTIC
PROTEINASE
MULTICATALYTIC
PROTEINASE, 20S
PROTEASOME, PROTEIN 2
DEGRADATION, ANTIGEN
PROCESSING, HYDROLASE,
PROTEASE


MULTICATALYTIC
PROTEINASE
MULTICATALYTIC
PROTEINASE, 20S
PROTEASOME, PROTEIN 2
DEGRADATION, ANTIGEN


Compound 




20S PROTEASOME;
CHAIN: A, B, C, D, E, F, G,
H, I, J, K, L, M, N, O, P, Q,


20S PROTEASOME;
CHAIN: A, B, C, D, E, F, G,
H, I, J, K, L, M, N, O, P, Q,


20S PROTEASOME;
CHAIN: A, B, C, D, E, F, G,
H, I, J, K, L, M, N, O, P, Q,


20S PROTEASOME;
CHAIN: A, B, C, D, E, F, G,
H, I, J, K, L, M, N, O, P, Q,


w ^-o 

73 M w 

sow 


SEQFOLD 
score






117.29 

i 




201.96 




PMF 
score 




1.00 




1.00




1.00 


Verify 
score 




0.56 




0.63 




0.76 
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i 




M 




1ryp


1ryp


1ryp


1ryp


1ryp


SEQID 
NO: 




oo 

9 


oo 

3 


3 


oo 

5 


oo 

9 






PDB annotation 


PROTEIN INTERACTION
DOMAIN,
TRANSCRIPTIONAL 2
REPRESSOR, ZINC-FINGER
PROTEIN, X-RAY
CRYSTALLOGRAPHY, 3
PROTEIN STRUCTURE,
PROMYELOCYTIC
LEUKEMIA, GENE
REGULATION


FINGER/DNA) ZINC FINGER,
PROTEIN-DNA
INTERACTION, PROTEIN
DESIGN, 2 CRYSTAL
STRUCTURE, COMPLEX
(ZINC FINGER/DNA)




Compound 


LEUKEMIA ZINC FINGER 
PROTEIN PLZF; CHAIN: A; 


DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER
PROTEIN; CHAIN: C, F, G;


DNA; CHAIN: A, B, D, E;
CONSENSUS ZINC FINGER
PROTEIN; CHAIN: C, F, G;




DNA; CHAIN: A, B, D, E; 
CONSENSUS ZINC FINGER 
PROTEIN; CHAIN: C, F, G; 


SEQFOLD 
score 












PMF 
score 




-0.20 


0.22 


-0.14 


0.82 


Verify 
score 




0.11 


-0.06 


i 


-0.01 


Psi 
Blast 






1e-47
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SEQID 
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PDB annotation 






COMPLEX (DNA-BINDING
PROTEIN/DNA) FIVE-
FINGER GLI; GLI, ZINC
FINGER, COMPLEX (DNA-
BINDING PROTEIN/DNA)




SIGNALING PROTEIN
PHOTORECEPTOR, G
PROTEIN-COUPLED
RECEPTOR, MEMBRANE
PROTEIN, 2 RETINAL
PROTEIN, VISUAL PIGMENT




Compound 


N REGULATION/DNA) 
TRAMTRACK PROTEIN 
(TWO ZINC-FINGER 
PEPTIDE) COMPLEXED 
WITH 2DRP 3 DNA 2DRP 4 


COMPLEX(TRANSCRIPTIO 
N REGULATION/DNA) 
TRAMTRACK PROTEIN 
(TWO ZINC-FINGER 
PEPTIDE) COMPLEXED 
WITH 2DRP 3 DNA 2DRP 4 


COMPLEX(TRANSCRIPTIO 
N REGULATION/DNA) 
TRAMTRACK PROTEIN 
(TWO ZINC-FINGER 
PEPTIDE) COMPLEXED 
WITH 2DRP 3 DNA 2DRP 4 


ZINC FINGER PROTEIN
GLI1; CHAIN: A; DNA;
CHAIN: C, D;


ZINC FINGER PROTEIN
GLI1; CHAIN: A; DNA;
CHAIN: C, D;


RHODOPSIN; CHAIN: A, B 


GROWTH FACTOR ACIDIC 
FIBROBLAST GROWTH 
FACTOR (AFGF) MUTANT 
WITH CYS 471 AFC 3 


SEQFOLD 
score 














51.39 


PMF 
score 




0.33 


0.52 


0.58 


0.65 


0.01 




Verify 
score 




-0.46 


-0.30 


-0.44


-0.26 


-0.47 
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i 



PDB annotation 


GROWTH FACTOR 


GROWTH FACTOR FGF-2; 
GROWTH FACTOR 


HORMONE/GROWTH 
FACTOR BETA-TREFOIL 


HORMONE/GROWTH
FACTOR BETA-TREFOIL


HORMONE/GROWTH 
FACTOR BETA-TREFOIL, 
HORMONE/GROWTH 
FACTOR 


GROWTH FACTOR AFG;
2AFG 6


GROWTH FACTOR; 
CHAIN: NULL; 


BASIC FIBROBLAST
GROWTH FACTOR;
CHAIN: NULL;


FIBROBLAST GROWTH 
FACTOR 7; CHAIN: A, B; 


FIBROBLAST GROWTH 
FACTOR 7; CHAIN: A, B; 


FIBROBLAST GROWTH 
FACTOR 7/1 CHIMERA; 
CHAIN: A; 


ACIDIC FIBROBLAST
GROWTH FACTOR; 2AFG 4
CHAIN: A, B, C, D; 2AFG 5


SEQFOLD 
score 














PMF 
score 




0.03 


0.76 


0.25 


0.75 


so 
VO 

© 


Verify 
score




0.11 


0.40 


0.29 


0.62 


d 
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Table 6 



SEQ ID NO:   Position of Signal Peptide   Maximum score   Mean score
1            24                           0.978           0.760
2            32                           0.995           0.681
3            37                           0.979           0.718
4            18                           0.925           0.822
5            28                           0.939           0.749
6            41                           0.989           0.690
7            26                           0.960           0.674
8            16                           0.973           0.925
9            24                           0.978           0.760
10           18                           0.887           0.579
11           42                           0.977           0.587
12           21                           0.966           0.848
13           25                           0.993           0.954
14           28                           0.909           0.664
16           23                           0.913           0.597
17           42                           0.978           0.689
18           21                           0.930           0.662
19           45                           0.985           0.714
20           37                           0.992           0.855
21           31                           0.947           0.775
22           20                           0.979           0.911
24           30                           0.924           0.720
25           26                           0.974           0.824
26           28                           0.982           0.649
28           16                           0.912           0.705
29           27                           0.957           0.652
30           22                           0.968           0.844
31           23                           0.952           0.812
32           18                           0.932           0.884
33           29                           0.991           0.729
34           26                           0.939           0.709
35           29                           0.961           0.842
36           16                           0.951           0.777
37           27                           0.983           0.898
38           17                           0.991           0.955
39           33                           0.977           0.822
40           17                           0.989           0.969
41           30                           0.936           0.679
42           24                           0.993           0.810
44           22                           0.990           0.921
54           18                           0.925           0.822
56           18                           0.981           0.951
60           28                           0.939           0.749
62           33                           0.979           0.757
70           41                           0.989           0.690
79           26                           0.960           0.674
83           18                           0.979           0.963
84           22                           0.967           0.792
87           25                           0.980           0.867
97           16                           0.973           0.925
98           24                           0.978           0.760
99           17                           0.978           0.925
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Table 6 



SEQ.ID NO: 


Position of Signal 
Peptide 


Maximum score 


Mean score 


113 


18 


0.887 


0.579 


115 


18 


0.952 


0.670 


120 


42 


0.977 


0.587 


137 


21 


0.966 


0.848 


140 


25 


0.993 


0.954 


153 


28 


0.909 


0.664 


156 


18 


0.954 


0.747 


174 


23 


0.913 


0.597 


175 


20 


0.986 


0.936 


178 


42 


0.978 


0.689 


180 


32 


0.929 


0.583 


184 


21 


0.979 


0.941 


192 


21 


0.930 


0.662 


200 


45 


0.985 


0.714 


212 


37 


0.992 


0.855 


225 


24 


0.971 


0.882 


228 


L 20 


0.979 


0.911 


237 


17 


0.982 


0.964 


251 


13 


0.918 


0.692 


252 


13 


0.918 


0.692 


256 




0.912 


0.693 


257 


20 


0.912 


0.693 


260 


26 


0.974 


0.824 


262 


18 


0.965 


0.833 


267 


25 


0.956 


0.765 


288 


16 


0.912 


0.705 


289 


18 


0.896 


0.634 


290 


19 


0.966 


0.897 


294 


18 


0.991 


0.973 


295 


20 


0.906 


0.580 


299 


27 


0.957 


0.652 


307 


19 


0.983 


0.871 


310 


22 


0.968 


0.844 


320 


23 


0.952 


0^812 


324 j 


27 


0.982 


0.911 


327 


1 o 

18 


0.983 


0.941 


328 


18 


0.932 


0.884 


332 


27 


0.990 


0.923 


335 


45 


0.983 


0.793 








U. fyj 


346 


29 


0.991 


0.729 


354 


22 


0.978 


0.877 


363 


26 


0.939 


0.709 


364 


22 


0.966 


0.843 


375 


29 


0.961 


0.842 


379 


16 


0.951 


0.777 


401 


44 


0.975 


0.876 


407 


33 


I 0.977 


0.822 


417 


17 


0.989 


0.969 


418 


23 


0.974 


0.799 


422 


18 


0.981 


0.952 


426 


21 


0.982 


0.912 
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Table 6 



SEQJDNO: 


Position of Signal 
Peptide 


Maximum score 


Mean score 


428 


30 


0.936 


0.679 


429 


43 


0.978 




433 


28 


0.993 n 


0.948 


434 


43 


0.930 


0.624 


437 


24 


0.993 


0.810 


438 


16 


0.978 


0.939 
Table 7 

SEQ ID NO:   Chromosomal location
3            2q11.2
4            20pter-p12.3
5            5q31
6            19p12
7            19p12
8            5
11           12p13-p12
12           p11.2-12.3
13           19p
14           6p12.1-21.1
15           19p13.1
17           16q12-q13
19           15
20           15
22           Xq13.1
23           12
25           11p15.5
26           20
27           22
28           12q23-24.1
29           20
30           13
31           12
33           15
36           4q28
37           14q24.3
38           10
39           20
41           17q12-q21
42           14
44           1q24.1-25.2
45           2
47           3q21-q25
48           9
49           14
50           6q14.1-15
51           19
52           11
53           20
54           16
55           14
56           3
57           19
58           7p15.1-p13
59           19
61           2
62           19
63           16
66           15
70           1p31.1-33
71           9
72           16
74           5q31-q33
75           3p21.1-q13.13
76           2
77           2
78           21q22.1
79           Xp11.22-p11.21
80           2
81           19
82           20
83           19p13.3
84           19
85           3
86           8
87           1p13
88           16
89           18q21.1-q22
90           11q13.1-q13.3
91           18p11.23-p11.21
92           17
93           10
94           3
95           X
96           6q14.2-16.1
97           1q21.2-22
98           1q21.2-22
99           6
102          8q22-q23
103          10p11.2
104          17
105          17
106          2
107          1
108          16
109          17q21.3-q22
110          11q
111          3p21.1-q13.13
112          16
113          5
114          9
115          3p13-q26.1
116          5
117          7q31
118          14
119          14
120          19
121          19
122          6q27
123          14
124          1q21-q22
125          6
126          17q25
127          15
129          14q31
130          1p36.1
131          11
132          20
133          20p11.23-p11.21
134          1p32
135          2q31
136          X
138          12p13
139          9
140          p34.1-34.3
141          19q12
142          15q26
143          22q11.21
144          17q12
145          4p16.3
146          22
147          16p11.2
148          18q12
150          4
151          7p12-q11.21
152          14
153          14q32.33
155          1p34
156          16p13.3
157          12p13.3
158          5
159          8
160          19
161          4
162          1
163          11q23
164          3
165          12q22
168          19
170          1
171          18q12
173          7
174          13
175          [illegible]
176          16
178          10
179          1q21-q25
180          19p13.3
181          1
184          1p35.1-36.23
185          1
186          18
187          3p13-q26.1
188          3
189          17
190          6
193          11p15.5
194          14q32
195          12
196          10q24
198          1p36.1
199          5q22
200          11
201          2q31
202          17
206          Xp11.23
207          9q34
208          19
209          20
210          11q23
211          16p12
212          19q13.1
213          7p15
214          15
215          1p36.21-36.33
216          11
217          22q11.2
218          15
219          19q13.4
222          19
223          1q25.2
226          1
227          1p36.11-36.23
228          1p36.3-p36.13
230          17
231          7q33-q34
232          3
233          9
234          10
235          17
236          4
237          19q13.4
238          4q25
239          2
240          7
241          12
243          6p21.3
244          3p13-q26.1
245          17
246          1p34.1
247          3q23
248          3p21.3
249          20
250          20
251          18q12-q21
252          18q12-q21
253          14
254          1p35.3-p35.1
256          6q25-q26
257          6q25-q26
258          1q21-q23
259          16p13.2-16p13.11
260          14q21.1-q24.1
261          2p23.3-q32.3
262          12
263          19
264          4q28
265          2
266          2
267          1q21-q23
268          20p12.3-p13
269          4
270          6
271          2p23.3-q14.3
272          18q21
273          18q21
274          14q22
275          6p21.3
276          5
280          8
281          4q22-q24
282          2
283          7q22-q31.1
284          11
285          11q12.3
286          10
287          19
290          17
291          4q22
292          1p36.11-36.23
293          19
294          22
296          3
297          4p16
298          6
299          8q13
300          20
301          15
302          22q11.2-q22
303          15
304          6
306          6
307          9p24.2
308          2p23.3-q24.3
309          14
310          6
311          2
312          4
313          19pter-19p13.3
314          3
316          11p12-14.2
317          19
318          17
319          17
320          5q14
323          4
324          3p
325          6p21.1-21.31
326          17p11.2
327          9
328          5q23
329          2
330          3
331          1p21.1-22.1
332          9
333          7
334          11q13
337          14
338          7q35-q36
339          13
340          6q11.1-22.33
341          11q12-q13.1
343          10
344          16
345          16
346          11q22
347          19
348          15q24-q26
350          Xp11.21-11.22
354          16
355          19
356          11
358          Xp11.23
359          4
360          8
362          4
363          11
364          11q13
365          7q31
366          22q13.31-13.32
367          5
370          19
371          7q31.1-7q31.33
372          2q37.3
373          3
374          16
375          19q13.4
376          18q12
377          18q12
379          8
380          11q13
381          6
385          4q28
386          15
387          10
388          17
389          11p15.4
390          6p21.3
391          22q13
392          3
393          19
394          15
395          1
396          6p21.2-p21.3
397          15
399          7q31
400          14
402          Xq28
403          10
404          16
406          16
408          11
412          20q12-13.1
413          15
414          17
415          4
416          12q
419          21q22.1
420          16p11.2
422          6
[illegible]  [illegible]
426          14
428          14
429          1q22-q23
430          11q13
431          3
432          2
433          19q13.1
434          20q13.1
435          18q23
436          11q24
437          10
438          4q21-q25
Table 8 



SEQ ID NO: of Full-length 
Nucleotide Sequence 


SEQ ID NO: of Full-length 
Nucleotide Sequence 


SEQ ID NO: in Priority Application 
USSN 09/774,528 


C.9 


C,9 
JZ 


j*f 


Jj 


c 7 

jj 


cc 

J J 


C.4 
J4 


C.A 
J4 


C£ 

jO 


JJ 


cc 
JJ 


J / 


jO 


JO 


CC 
JO 


C7 


C7 
J / 


CO 

jy 


jo 


CO 

JO 


<n 


r co" 

jy 


CO 

jy 


/Cl 

ol 


/CA 


/CA 

6U 


/CO 

02 


/C1 

ol 


/CI 
01 


/CO 

oi 


/CO 

oZ 


/CO 
6Z 


/C/1 

04 


/CI 


/CI 

63 


/CC 
OJ 


04 


64 


/C/C 

00 


/CC 

Oj 


/CC 

65 


/CO 

67 


/C*C 

66 


/C/C 

oo 


68 


/CT 

6/ 


/C7 
0/ 


/CA 

oy 


/CO 

Do 


/CO 

oo 


OA 

70 


/JO 

oy 


/CO 

oy 


/l 


*7A 


*7A 


/Z 


/l 


/l 


/i 


79 
/Z 


70 

/Z 


7/1 


79 
/i 


77 
/i 


7C 

/ J 


7vl 
/4 


9/1 
/4 


7/C 


7C 
/J 


7C 
/J 


T7 
// 


7/C 
/O 


7/S 

/o 


70 

/o 


77 


77 

/ / 


70 

/y 


70 

/o 


TO 
/o 


OA 

oU 


oo 


70 

/y 


O 1 

ol 


on 
ISU 


OA 

oU 


01 

oz 


0 1 
01 


0 1 

ol 


oi 


oZ 


HZ 


QA 

o4 


oi 


01 

oi 


o c 


0/1 

84 


84 


0/C 

oo 


DC 
Oj 


cc 

OJ 


OO 
0/ 


Q/c 
oo 


c/c 
SO 


OO 

oo 


o/ 


CO 
0/ 


on 


CO 
00 


00 

oo 


OA 

.yu 


CO 

oy 


CO 


Ol 

yl 


on 
yu 


on 
yu 


Ol 

yz 


91 


91 




92 


92 


94 


93 


93 


95 


94 


94 


96 


95 


95 


97 


96 


96 


98 


97 


97 . 


99 


98 


98 


100 


99 


99 


101 


100 


100 


102 


101 


101 


103 


102 


102 


104 


103 


103 


105 
Table 8 



SEQ ID NO: of Full-length 
Nucleotide Sequence 


SEQ ID NO: of Full-length 
Nucleotide Sequence 


SEQ ID NO: in Priority Application 
USSN 09/774,528 


1U4 


1 AA 
1U4 


iuo 


1UD 


1 fie 
IUj 


1U / 


i r\a 
106 


1 A£ 
1U0 


1 AQ 


1 A*7 
10/ 


1 A*7 
10/ 


1 AO 

iuy 


1 AO 

lOo 


1 AO 

lUo 


1 1A 
1 1U 


1 AA 

109 


1 AA 

109 


1 1 1 


1 1 A 

110 


1 1 A 

110 


1 1 o 
112 


111 

111 


111 

111 


117 

1 13 


112 


1 1 o 

112 


11/1 

1 14 


113 


113 


115 


1 1 A 

1 14 


114 


1 1/ 

116 


115 


t 1 c 

115 


117 


116 


1 16 


1 to 

1 lo 


1 17 


117 


1 1 A 

119 


t 1 o 

118 


i to 

118 


1 OA 

120 


1 1 A 

119 


1 1 A 

119 


121 


1 OA 

120 


1 OA 

120 


1 oo 

122 


121 


121 


123 


1 ti 
122 


122 


1 0/1 

124 


1 0*5 

123 


IOT 

123 


i o<: 
12j 


124 


1 O/l 

124 


1 0£ 

120 


1 o^ 
12 j 


i o< 


1 07 
12 / 


1 o/c 
120 


1 0£ 

120 


1 00 
125 


1 1*7 

12/ 


12/ 


1 OQ 


128 


no 

128 . 


1 1A 

13U 


129 


1 OA 

129 


1 o i 
131 


1 1A 
130 


13U 


1 10 

132 


in 

131 


in 
131 


133 


1 ^o 

132 


1 "JO 

13/ 


i i/i 
134 


133 


133 


13j 1 


1 1/1 

134 


1 1/1 

134 


1 1£ 
130 


13j 


133 


1 17 
13/ 


1 1/C 

136 


1 1/C 

136 


110 

13o 


1 1*7 

137 


1 lO 

137 


TIO 

13V 


1 lO 

138 


1 1C 

138 


14U 


139 


1 lO 

i3y 


141 


1 A A 

140 


14U 


1 /I0 

142 


! 141 


1/11 
141 


1/11 


i /to 
142 


1 /lO 

142 


1 AA 
144 






145 


144 


144 


146 


145 


145 


147 


146 


146 


148 


147 


147 


149 


148 


148 


150 


149 


149 


151 


150 


150 


152 


151 


151 


153 


152 


152 


154 


153 


153 


155 


154 


154 


156 


155 


155 


157 
Table 8 



SEQ ID NO: of Full-length 
Nucleotide Sequence 


SEQ ID NO: of Full-length 
Nucleotide Sequence 


SEQ ID NO: in Priority Application 
USSN 09/774,528 


l JO 


156 


1 JO 


1 57 


K7 
U / 


1 59 


1 SR 

JL JO 


1 5R 

i JO 




1 

1 J7 


1 


161 


1 AH 


lOv 


lUx> 


101 


1 

101 


1 £.-1 
lOJ 


16Z 


lOz 


1 Oh 


1 £1 

loJ 


10J 


ltO 


1 &A 


104 


1 AA 
100 


1 <cc 
165 


165 


1 /C7 

10/ 


166 


166 


1 £0 

168 ! 


167 


167 


169 


168 


165 


1 *7A 

170 


169 


169 


171 


170 


t ta 

170 


172 


171 


I7l 


1 /J 


172 


172 


1 /4 


173 


173 


1 /5 


i n a 
174 


1 /4 


1 7/C 
1 /O 


175 


1 7^ 

1 /5 


1 n ~l 
1 / / 


176 


1 7/£ 
1 /O 


1 70 ! 

1 /5 


1 77 
1 // 


1 / / 


i /y 


1 70 


1 70 


i fin 

loU 


1 70 




1fi 1 
15 1 


loU 




1 R9 
10Z 


1 fil 
151 


1 R1 


1R1 

lOJ 


loZ 


1 R9 
loz 


Ifii 


lOJ 


1R^ 


1R^ 

lOJ 


1 fill 
154 


1 fid 
lot 


10/: 

lOO 


15J 


1 R5 


1 R7 
io / 


1 RA 
150 


1 R6 
1 oo 


1 RR 

1 00 


1 07 
15/ 


15 / 


1 fiQ 


1 oc 
155 


1 RR 
155 


ion 


loy 


1 CO 


1 Q1 

iy i 


1 OA 

19U 


i on 


1VZ 




1Q1 

iy 1 


1 01 

lyj 
WHAT IS CLAIMED IS: 

1. An isolated polynucleotide comprising a nucleotide sequence selected from the 
group consisting of SEQ ID NO: 1-438, a mature protein coding portion of SEQ ID NO: 
1-438, an active domain coding portion of SEQ ID NO: 1-438, and complementary 
sequences thereof. 

2. An isolated polynucleotide encoding a polypeptide with biological activity, 
wherein said polynucleotide hybridizes to the polynucleotide of claim 1 under stringent 
hybridization conditions. 

3. An isolated polynucleotide encoding a polypeptide with biological activity, 
wherein said polynucleotide has greater than about 90% sequence identity with the 
polynucleotide of claim 1. 

4. The polynucleotide of claim 1 wherein said polynucleotide is DNA. 

5. An isolated polynucleotide of claim 1 wherein said polynucleotide comprises the 
complementary sequences. 

6. A vector comprising the polynucleotide of claim 1. 

7. An expression vector comprising the polynucleotide of claim 1. 

8. A host cell genetically engineered to comprise the polynucleotide of claim 1. 

9. A host cell genetically engineered to comprise the polynucleotide of claim 1 
operatively associated with a regulatory sequence that modulates expression of the 
polynucleotide in the host cell. 

10. An isolated polypeptide, wherein the polypeptide is selected from the group 
consisting of: 

(a) a polypeptide encoded by any one of the polynucleotides of 
claim 1; and 

(b) a polypeptide encoded by a polynucleotide hybridizing under 
stringent conditions with any one of SEQ ID NO: 1-438. 

11. A composition comprising the polypeptide of claim 10 and a carrier. 

12. An antibody directed against the polypeptide of claim 10. 

13. A method for detecting the polynucleotide of claim 1 in a sample, comprising: 

a) contacting the sample with a compound that binds to and forms a 
complex with the polynucleotide of claim 1 for a period sufficient to form the complex; 
and 

b) detecting the complex, so that if a complex is detected, the 
polynucleotide of claim 1 is detected. 

14. A method for detecting the polynucleotide of claim 1 in a sample, comprising: 

a) contacting the sample under stringent hybridization conditions 
with nucleic acid primers that anneal to the polynucleotide of claim 1 under such 
conditions; 

b) amplifying a product comprising at least a portion of the 
polynucleotide of claim 1; and 

c) detecting said product and thereby the polynucleotide of claim 1 in 
the sample. 

15. The method of claim 14, wherein the polynucleotide is an RNA molecule and the 
method further comprises reverse transcribing an annealed RNA molecule into a cDNA 
polynucleotide. 

16. A method for detecting the polypeptide of claim 10 in a sample, comprising: 

a) contacting the sample with a compound that binds to and forms a 
complex with the polypeptide under conditions and for a period sufficient to form the 
complex; and 

b) detecting formation of the complex, so that if a complex formation 
is detected, the polypeptide of claim 10 is detected. 

17. A method for identifying a compound that binds to the polypeptide of claim 10, 
comprising: 

a) contacting the compound with the polypeptide of claim 10 under 
conditions sufficient to form a polypeptide/compound complex; and 

b) detecting the complex, so that if the polypeptide/compound 
complex is detected, a compound that binds to the polypeptide of claim 10 is identified. 

18. A method for identifying a compound that binds to the polypeptide of claim 10, 
comprising: 

a) contacting the compound with the polypeptide of claim 10, in a 
cell, under conditions sufficient to form a polypeptide/compound complex, wherein the 
complex drives expression of a reporter gene sequence in the cell; and 

b) detecting the complex by detecting reporter gene sequence 
expression, so that if the polypeptide/compound complex is detected, a compound that 
binds to the polypeptide of claim 10 is identified. 

19. A method of producing the polypeptide of claim 10, comprising, 

a) culturing a host cell comprising a polynucleotide sequence selected 
from SEQ ID NO: 1-438, a mature protein coding portion of SEQ ID NO: 1-438, an 
active domain coding portion of SEQ ID NO: 1-438, complementary sequences thereof 
and a polynucleotide sequence hybridizing under stringent conditions to SEQ ID NO: 
1-438, under conditions sufficient to express the polypeptide in said cell; and 

b) isolating the polypeptide from the cell culture or cells of step (a). 

20. An isolated polypeptide comprising an amino acid sequence selected from the 
group consisting of any one of the polypeptides encoded by SEQ ID NO: 1-438, the 
mature protein portion thereof, or the active domain thereof. 

21. The polypeptide of claim 20 wherein the polypeptide is provided on a polypeptide 
array. 

22. A collection of polynucleotides, wherein the collection comprises the sequence 
information of at least one of SEQ ID NO: 1-438. 

23. The collection of claim 22, wherein the collection is provided on a nucleic acid 
array. 

24. The collection of claim 23, wherein the array detects full-matches to any one of 
the polynucleotides in the collection. 

25. The collection of claim 23, wherein the array detects mismatches to any one of 
the polynucleotides in the collection. 

26. The collection of claim 22, wherein the collection is provided in a computer- 
readable format. 

27. A method of treatment comprising administering to a mammalian subject in need 
thereof a therapeutic amount of a composition comprising a polypeptide of claim 10 or 20 
and a pharmaceutically acceptable carrier. 

28. A method of treatment comprising administering to a mammalian subject in need 
thereof a therapeutic amount of a composition comprising an antibody that specifically 
binds to a polypeptide of claim 10 or 20 and a pharmaceutically acceptable carrier. 

The DNAs of SEQ ID NO: 1-438 are different in structure and encode polypeptides having different structure and different function or 
substrate specificity. Therefore, in addition to electing one Group, applicants must further elect one DNA sequence or one 
polypeptide sequence encoded by SEQ ID NO: 1-438. 

The technical feature linking Groups I-VIII appears to be that they all relate to the DNA of SEQ ID NO: 1-438. 
However, Dumas et al. teach a polypeptide encoded by a polynucleotide that is 99% identical to SEQ ID NO:231. 

Therefore, the technical feature linking the inventions of Groups I-X does not constitute a special technical feature as defined by PCT 
Rule 13.2, as it does not define a contribution over the prior art. 

Groups I-III do not share a technical feature because a DNA, a protein, and an antibody are different compounds, each with its own 
chemical structure and function, and they have different utilities. The DNA molecule of Group I is not limited in use to the 
production of the polypeptide of Group II and can be used as a hybridization probe, and the protein of Group II can be obtained by a 
materially different method such as biochemical purification. The structure of an antibody of Group III is not predictable from the 
structure of the protein of Group II, and an antibody can cross-react with various proteins. 

The special technical feature of Group I is a DNA of SEQ ID NO: 1-438, vector comprising said DNA, host cell comprising said 
DNA and a method of producing polypeptides. 

The special technical feature of Group II is a polypeptide encoded by the DNA of Group I. 
The special technical feature of Group III is an antibody against the protein of Group II. 
The special technical feature of Group IV is a method of detecting the DNA of Group I. 
The special technical feature of Group V is a method of detecting the polypeptide of Group II. 

The special technical feature of Group VI is a method of identifying a compound that binds to the polypeptide of Group II. 

The special technical feature of Group VII is a method of treatment using the polypeptide of Group II. 

The special technical feature of Group VIII is a method of treatment using the antibody of Group III. 

Accordingly, Groups I-X are not so linked by the same or a corresponding special technical feature as to form a single general 
inventive concept. 